The Database Systems team specializes in high-performance distributed databases. Our team built Rockset, the real-time search, analytics, and vector database that powers all vector search and retrieval-augmented generation (RAG) at OpenAI. In addition to retrieval, as an online database, Rockset powers core functionality across all of OpenAI's product lines and many critical internal use cases.
About the Role:
We are looking for engineers passionate about distributed systems, close-to-the-metal performance optimization (our core engine is written in C++), and building scalable database infrastructure from the ground up. As an engineer on the Database Systems team, you'll contribute to the core database engine, driving improvements across ingestion, query execution, indexing, and storage. You'll partner with teams across OpenAI to unlock new product capabilities and help scale online database reliability and throughput as usage grows by orders of magnitude.
In this role you will:
Design, build, and operate high-performance distributed systems
Identify and resolve performance bottlenecks to scale infrastructure to the next order of magnitude
Define long-term technical direction and guide system evolution
Collaborate with product, engineering, and research teams to deliver scalable and reliable infrastructure
Dig deep into complex production issues across the stack
Contribute to incident response, postmortems, and best practices for system reliability
You might thrive in this role if you:
Have significant experience building, scaling, and optimizing large distributed systems
Are curious about database internals, storage engines, or low-latency query systems
Enjoy debugging challenging performance issues in complex, high-throughput systems
Have experience operating production clusters at scale (e.g., Kubernetes or other orchestration systems)
Think rigorously about scalability, correctness, and reliability
Thrive in fast-paced environments with high autonomy and impact
Qualifications:
4+ years of relevant industry experience, with 2+ years leading large-scale, complex projects or teams as an engineer or tech lead
Experience with distributed systems at scale, with a strong focus on performance, reliability, and scalability
Strong communication skills and ability to collaborate across highly technical and cross-functional teams
Proficiency in a systems programming language such as C++ (our core engine is written in C++) is strongly preferred
Fluency in cloud environments (AWS, GCP, Azure) and infrastructure-as-code (IaC) tools (Terraform or similar)
Experience with Linux systems, CI/CD pipelines, and modern observability stacks (Prometheus, Grafana, etc.)
Domain knowledge in areas such as databases, data systems, storage engines, indexing, and query processing is a plus but not required