Description
The successful candidate will join a highly cross-functional team dedicated to product software, back-end infrastructure, and tooling that supports our machine learning initiatives. Our team collaborates with a wide range of internal teams across the company, as well as developers and external partners. This role is focused on the development and deployment of innovative systems and methods for data science and synthesis, while staying current with the latest advancements in these fields.
Key responsibilities include:
* Developing strategies and algorithms for mining billion-scale datasets for targeted model training.
* Addressing challenges in automatically evaluating generative results, identifying and classifying failure cases, and assessing their prevalence and severity.
* Streamlining human-in-the-loop processes for dataset construction, and designing and implementing systems that account for subjectivity and human error.
* Synthesizing training data and augmentations to real-world data.