Description
This role involves using generative models and optimization techniques to produce diverse, representative data when real-world data is scarce, sensitive, or prohibitively expensive to collect. The researcher collaborates closely with other ML scientists and data scientists to understand data requirements, optimize generation pipelines, and ensure that the synthetic data supports model generalization and robustness.
In addition to research and technical implementation, the role requires developing quality control mechanisms, such as human-in-the-loop feedback, to ensure that synthetic datasets are valid. This work will support applications in diverse domains.