• 5+ years of successful experience in a similar DX / DevOps / SRE role.
• Proficiency in software development (Python, Go...) and programming best practices.
• Exposure to site reliability engineering (root cause analysis, in-production troubleshooting, on-call rotations...).
• Exposure to infrastructure management (CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability...).
• Technical product mindset (e.g. understanding how to debug poor adoption).
• Excellent problem-solving and communication skills (ability to contextualize, gauge risks and get buy-in for high-stakes, impactful solutions).
• Ownership, high agency and a constant drive to learn and improve things for others.
• Autonomous, self-driven and able to work well in a fast-paced startup environment.
• Low ego and a team-spirit mindset.
Your application will be all the more interesting if you also have:
• First-hand Bazel (or equivalent) experience.
• Strong knowledge of Python's ecosystem.
• Familiarity with GPU-based workloads and ecosystems.
• Experience in fully remote environments (you're comfortable with having some of your users on the other side of the globe).