OpenAI · San Francisco · Hybrid

Research Engineer, Codex

8/5/2025

Description

The Codex team is responsible for building state-of-the-art AI systems that can write code, reason about software, and act as intelligent agents for developers and non-developers alike. Our mission is to push the frontier of code generation and agentic reasoning, and deploy these capabilities in real-world products such as ChatGPT and the API, as well as in next-generation tools specifically designed for agentic coding. We operate across research, engineering, product, and infrastructure—owning the full lifecycle of experimentation, deployment, and iteration on novel coding capabilities.

About the Role

As a member of the Codex team, you will advance the capabilities, performance, and reliability of AI coding models through a combination of research, experimentation, and system optimization. You’ll collaborate with world-class researchers and engineers to develop and deploy systems that help millions of users write better code, faster—while also ensuring these systems are efficient, cost-effective, and production-ready.

We’re looking for people who combine deep curiosity, strong technical fundamentals, and a bias toward impact. Whether your strengths lie in ML research, systems engineering, or performance optimization, you’ll play a pivotal role in pushing the state of the art and bringing these advances into the hands of real users.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you might:

  • Design and run experiments to improve code generation, reasoning, and agentic behavior in Codex models.

  • Develop research insights into model training, alignment, and evaluation.

  • Hunt down and address inefficiencies across the Codex system stack—from agent behavior to LLM inference to container orchestration—and land high-leverage performance improvements.

  • Build tooling to measure, profile, and optimize system performance at scale.

  • Work across the stack to prototype new capabilities, debug complex issues, and ship improvements to production.

Qualifications

You might thrive in this role if you:

  • Are excited to explore and push the boundaries of large language models, especially in the domain of software reasoning and code generation.

  • Have strong software engineering skills and enjoy quickly turning ideas into working prototypes.

  • Think holistically about performance, balancing speed, cost, and user experience.

  • Bring creativity and rigor to open-ended research problems and thrive in highly iterative, ambiguous environments.

  • Have experience operating across both ML systems and cloud infrastructure.

Application

View the original listing and apply!