OpenAI · San Francisco · Hybrid

Machine Learning Engineer, Distributed Data Systems - Robotics

2/6/2026

Description

Our mission is to expand the capabilities of foundational models to support general-purpose robotics in dynamic, real-world environments, ensuring reliable and safe operation. These capabilities include, but are not limited to, action generation, motion planning, world modeling, and real-time communication through voice and emotions.

We work across the entire robotics stack, integrating cutting-edge hardware, sensors, end-effectors, and models to explore a diverse range of robotic form factors. Our goal is to develop models that provide state-of-the-art intelligence combined with seamless physical skills under the practical constraints of robotic platforms.

About the Role

As a Research Engineer, Distributed Data Systems, you will design and scale the infrastructure that powers large-scale multimodal training and evaluation at OpenAI. You’ll manage distributed data pipelines, collaborate closely with researchers to translate requirements into robust systems, and harden pipelines that serve as the backbone for OpenAI's rapid iteration cycles.

We’re looking for engineers who are detail-oriented, have strong experience with distributed systems, and excel at building reliable infrastructure in high-stakes environments.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Design, build, and maintain data infrastructure systems, such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security.

  • Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient.

  • Partner with researchers to deeply understand requirements and translate them into production-ready systems.

  • Harden, optimize, and maintain critical data infrastructure systems that power multimodal training and evaluation.

Qualifications

  • Have strong experience with distributed systems and large-scale infrastructure, along with a strong interest in data.

  • Are detail-oriented and bring rigor to building and maintaining reliable systems.

  • Demonstrate excellent software engineering fundamentals and organizational skills.

  • Are comfortable with ambiguity and rapid change.

Application

View listing at origin and apply!
