As a Research Scientist/Engineer on the Alignment Finetuning team at Anthropic, you'll lead the development and implementation of techniques for training language models that are more aligned with human values: models that demonstrate better moral reasoning, improved honesty, and good character. You'll develop novel finetuning techniques and use them to demonstrably improve model behavior.
Note: For this role, we conduct all interviews in Python. We are pacing our team's growth, so you may not hear back on your application to this team unless we see an unusually strong fit.