OpenAI · San Francisco · Hybrid

Data Scientist, Infrastructure

7/22/2025

Description

Our infrastructure team helps deliver OpenAI’s most capable models and products to the world by scaling infrastructure and turning demand into useful FLOPS. We collaborate across research, engineering, design, and business to turn cutting-edge AI advancements into impactful, real-world applications. Our team ensures the right compute is available, at the right time and place, to support some of the world’s most demanding workloads. This work makes it possible to launch new models and products reliably and at scale.

About the Role

As a Data Scientist on the Infra team, you will play a key role in shaping how we scale the infrastructure that powers OpenAI’s products and research. This is critical as we operate one of the largest and most advanced compute fleets in the world, supporting millions of users and businesses globally. We focus on aligning infrastructure measurement, planning, scaling, allocation, and efficiency to drive measurable impact across the company.

You should expect to guide the definition of foundational datasets for infrastructure resources, develop metrics that inform key decisions, build forecasting and optimization models, and establish source-of-truth dashboards and analyses that enable teams to understand and improve infra usage. Most importantly, you should expect to be a core partner to engineering, research, and product teams in shaping the infrastructure that powers everything OpenAI builds.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Build and maintain foundational datasets and metrics that reflect infrastructure usage, efficiency, and scaling.

  • Develop forecasting and optimization models to support infra planning and resource allocation.

  • Partner with engineering, research, and product teams to shape infrastructure strategy through data.

  • Drive clarity with source-of-truth dashboards and analyses that guide infra decisions across OpenAI.

Qualifications

  • 5+ years of experience in a quantitative role navigating ambiguous environments, ideally in infrastructure, systems, or platform domains at a high-growth company or research org

  • Experience defining and operationalizing metrics that reflect system performance, resource usage, or efficiency from the ground up

  • A strong foundation in SQL and Python, and a track record of building models and analyses that drive technical and strategic decisions

  • Excellent communication skills and the ability to partner effectively with engineers, researchers, and product stakeholders

  • A strategic mindset that goes beyond statistical testing to surface actionable insights and long-term tradeoffs

You could be an especially great fit if you have:

  • A proven track record of operating as a data partner in large-scale backend systems

  • Comfort navigating fast-paced execution while also anchoring decisions in long-term impact

  • A strong programming background, with the ability to run simulations and prototype variants

  • Experience in NLP, large language models, or generative AI

Application

View listing at origin and apply!