Work on post-training evaluation and fine-tuning of large-scale models to improve performance and safety.
Define and champion the technical roadmap for large-scale data and evaluation supporting the Gemini model family and its real-world applications.
Drive research into novel, high-signal evaluation methods (automated, human-in-the-loop, and adversarial) to measure model capabilities, alignment, safety, and trustworthiness.
Actively contribute to the broader scientific community by presenting findings on cutting-edge AI evaluation and safety methods.
Qualifications
10+ years of experience in research engineering, with at least 5 years in a technical leadership role.
Experience with large-scale machine learning systems, data processing pipelines, and evaluation methodologies.
Experience with large language models (LLMs) and their evaluation.