As a Societal Impacts research scientist on the Models Research Pod, you'll close the loop between observing Claude's behavior and improving it at the model level. You'll use observational tools like Clio to analyze real-world usage patterns and build evaluations that assess whether Claude provides safe responses aligned with its Constitution.
Strong candidates will have experience with machine learning systems and a genuine interest in societal impacts research. You should be adaptable and excited to contribute to evolving team priorities rather than coming in with a fixed agenda. The role is highly cross-functional, with regular collaboration across the fine-tuning, safeguards, policy, and interpretability teams.
We're hiring at both junior and senior levels. Senior researchers should be comfortable doing hands-on technical work while also helping set research direction.
Clio: A system for privacy-preserving insights into real-world AI use
How People Use Claude for Support, Advice, and Companionship
The Capacity for Moral Self-Correction in Large Language Models
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Towards Measuring the Representation of Subjective Global Opinions in Language Models
Collective Constitutional AI: Aligning a Language Model with Public Input