Position Details
About this role
This role is for a Site Reliability Engineer (SRE) who will join the Cloud Infrastructure team to improve the reliability and scalability of its self-service platforms, with a focus on Kubernetes-based workloads and modern CI/CD practices.
Key Responsibilities
- Collaborate with internal customers and partners to deliver key business outcomes
- Ensure cloud products are reliable, scalable, and secure
- Enhance observability across cloud services
- Respond to cloud incidents and perform root cause analysis
- Drive CI/CD improvements
Technical Overview
The stack emphasizes Go and Python programming, Kubernetes and Kubernetes controllers, RESTful APIs, CI/CD, and observability, all in support of large-scale cloud services.
Ideal Candidate
The ideal candidate is a mid-level SRE with 3+ years of cloud-native experience, strong Go/Python skills, and solid Kubernetes expertise including controllers. They should excel in observability, incident response, and CI/CD improvements within a fast-paced, open-source-friendly environment.
Deal Breakers
- 3+ years of Go or Python
- Kubernetes experience
- Experience with Kubernetes controllers