Position Details
About this role
This role involves leading platform reliability projects using SRE practices, developing monitoring and observability solutions, and troubleshooting system issues to ensure high availability.
Key Responsibilities
- Lead platform reliability projects
- Implement monitoring and observability
- Automate incident response
- Troubleshoot system disruptions
- Develop runbooks and perform root cause analysis
Technical Overview
The technical environment includes SRE methodologies, monitoring tools, automation pipelines, and incident management processes.
Ideal Candidate
The ideal candidate is a mid-level Site Reliability Engineer with 4+ years of experience in monitoring, automation, and incident management, capable of leading system reliability initiatives and troubleshooting complex issues.
Deal Breakers
- Lack of experience with SRE practices
- No background in monitoring or automation
- Unable to work remotely in the US