Position Details
About this role
This role leads the development and implementation of reliable, scalable infrastructure systems, applying SRE best practices, automation, and observability tools to keep systems highly available and performant.
Key Responsibilities
- Designing system architectures
- Leading automation initiatives
- Monitoring and troubleshooting system issues
- Mentoring engineering teams
- Implementing security and reliability improvements
Technical Overview
The technical environment includes cloud platforms such as AWS, Azure, and GCP; containerization with Docker and orchestration with Kubernetes; infrastructure as code with Terraform; and monitoring with Prometheus and Grafana.
Ideal Candidate
The ideal candidate is a mid-level Site Reliability Engineer with 6+ years of experience in cloud infrastructure, observability, and automation. They possess strong leadership skills and expertise in designing scalable, reliable systems using SRE practices.
Deal Breakers
- No experience with SRE practices
- Lack of cloud infrastructure knowledge
- Less than 6 years of relevant experience
- No experience with monitoring tools
- Unwillingness to work in a hybrid environment