Apple · Sunnyvale

Sr. Cloud Ops / SRE, Applied Machine Learning

(w/m/d) · 8.1.2025

Description

Join Apple's Applied Machine Learning team as a Senior Software Engineer to build and support innovative software applications. Candidates should have a strong background in setting up and supporting infrastructure for large-scale big data applications in a public cloud such as AWS or GCP.

The main responsibilities for this position include:

  • Build & support CI/CD tools to port & manage applications on AWS/GCP & Kubernetes
  • Understand application requirements (performance, security, scalability, etc.) and assess the right services/topology on AWS/GCP & Kubernetes
  • Deploy & support applications on Kubernetes-based environments: on-prem K8s, AWS EKS, GCP GKE
  • Build automation to enable self-healing systems
  • Build tools to monitor and alert on high-performance, low-latency applications on AWS/GCP
  • Troubleshoot application-specific, core network, system & performance issues
  • Take part in challenging, fast-paced projects supporting Apple's business by delivering innovative solutions
  • Monitor production, staging, test and development environments for a myriad of applications in an agile and dynamic organization

The candidate is expected to be a self-motivated, proactive, and solution-oriented individual.

Qualifications

  • 5+ years of experience in SRE/DevOps
  • Bachelor's degree with 4+ years of experience, or MS with 2+ years of experience, or related experience
  • Strong Unix & Python programming skills
  • Extensive experience in managing applications on AWS/GCP & Kubernetes
  • Strong experience with infrastructure templating tools like CloudFormation and Terraform
  • Strong proficiency with Helm or Kustomize for managing Kubernetes applications and configurations.

Preferred Qualifications

  • BS in Computer Science with 4+ years of experience, or MS with 2+ years of experience, or related experience
  • Practical understanding of cloud networking concepts such as VPCs, DNS, security groups, and the Kubernetes network model
  • Experience in building CI/CD pipelines for large-scale applications on AWS/GCP & Kubernetes
  • Experience in GitOps based deployment tools like Spinnaker/Flux/ArgoCD
  • Experience in enabling AutoScaling for both VM & Containerized workloads
  • Deep understanding of object-oriented programming in languages like Java
  • Experience in performance tuning of JVMs & operating systems like Linux
  • Good understanding of data security for cloud-based applications
  • Excellent analytical & problem solving skills
  • Exposure to Large Language Models, vector databases, RAG, and GenAI platforms like AWS Bedrock or GCP Vertex AI is preferred
