Google DeepMind · London

Research Scientist, Science of Post-Training and Reinforcement Learning

3/4/2026

Description

You will work closely with Ian Osband and the team on research around post-training for agents and LLMs, including practical RL methods and evaluation. This is not a theory-only role; you should expect to implement code, run experiments, and own results end-to-end. Success in this role is defined by whether the team learns faster and whether the work produced is crisp, honest, and high-quality.

Key Responsibilities

Propose and test research hypotheses in post-training and RL for agents/LLMs.
Implement algorithm ideas and run end-to-end experiments, including setup, execution, analysis, and iteration.
Design evaluations and ablations that answer real questions and change minds.
Analyze results carefully, including debugging and failure analysis.
Communicate clearly through plots, writeups, and paper-ready narratives and figures.
Collaborate closely with engineering and research partners to keep the team aligned on findings and strategy.
Contribute to a culture of first-principles thinking, high standards, and direct, constructive feedback.

Qualifications

A research track record in ML/RL, demonstrated through publications or high-quality projects.
Strong implementation ability and comfort working in research codebases.
Evidence of owning experiments end-to-end, including analysis and interpretation.
Strong communication skills and a bias toward clarity and honesty regarding results.
High agency and drive: You push projects forward, prioritize effectively, and take initiative.
PhD in ML preferred, or equivalent practical experience.

Experience with RL for sequence models, post-training, preference-based learning, or agentic systems.
Experience with modern research stacks (e.g., JAX/Flax or PyTorch) and scaling experiments.
Strong experimental taste: Good judgment regarding baselines, ablations, and what is worth testing.
Comfort with scaling, evaluation methodologies, and diagnosing complex failure modes.
A focus on craft: You care about doing excellent work while maintaining a high velocity.

Application

View the original listing to apply.
