Description
The successful candidate will join a highly cross-functional team focused on product software, back-end infrastructure, and tooling in support of our machine learning efforts. Our team works with multiple internal teams across the company, as well as with developers and external partners.
This role focuses on growing and leading a team dedicated to enhancing our data science and synthesis capabilities. This includes staying informed about the latest industry advancements, mentoring team members, and fostering collaboration with partner teams to develop and deliver impactful solutions. The team’s key areas of focus include:
- Developing strategies and algorithms for mining datasets numbering in the billions of items for targeted model training.
- Addressing challenges in automatic evaluation of generative results, identification and classification of failure cases, and strategies for assessing their prevalence and severity.
- Streamlining human-in-the-loop processes for dataset construction, and designing and implementing systems that account for subjectivity and human error.
- Synthesizing training data, as well as augmentations to real-world data.