Position Details
About this role
This role involves designing and maintaining scalable, reliable infrastructure systems, leading incident response efforts, and implementing automation to support mission-critical production services.
Key Responsibilities
- Design scalable infrastructure
- Implement automated deployment pipelines
- Lead incident response
- Develop and monitor SLOs and error budgets
- Improve system reliability
Technical Overview
The role focuses on infrastructure automation, deployment pipelines, observability, disaster recovery, and distributed systems at scale, with a strong emphasis on system reliability and incident management.
Ideal Candidate
The ideal candidate is a senior SRE with more than 7 years of experience designing scalable infrastructure, automating deployment pipelines, and managing system reliability. They are skilled in incident response and in implementing observability solutions.
Deal Breakers
- Less than 7 years of relevant experience
- Lack of experience with infrastructure automation
- No background in incident response or system reliability
- Unwillingness to work in a hybrid environment