Description
As a staff software engineer, you have the following responsibilities:
- Own the architecture, design, development, and operations of large-scale systems designed for machine learning.
- Develop custom scheduling, resource management, and fleet management solutions for our ML model training compute infrastructure.
- Collaborate with multi-functional teams, integrate with Kubernetes in on-premises and cloud provider clusters, and enable seamless integration with NVIDIA GPUs and other ML accelerators.
- Partner with data scientists and machine learning engineers across different Apple organizations to define high-impact product features and deliver them with quality. In this role, you are building the platform upon which other teams will develop data pipelines and machine learning applications.
- Lead a group of engineers to deliver high-quality products and services. Stay on top of innovative technologies and apply them on the job.
- Support junior engineers by providing advice, mentoring, and educational opportunities.