Position Details
About this role
This role leads reliability efforts for Okta's SaaS platform, with a focus on system uptime, scalability, and automation.
Key Responsibilities
- Drive technical reliability strategy
- Manage on-call rotations
- Implement monitoring and automation solutions
- Lead incident response and troubleshooting
- Ensure scalability and performance
Technical Overview
The technical environment covers cloud infrastructure, monitoring, incident response, and automation, using tools such as Kubernetes and AWS.
Ideal Candidate
The ideal candidate is a mid-level Site Reliability Engineer with 3+ years of experience in cloud infrastructure, monitoring, and incident management. They possess strong leadership skills and a deep understanding of automation and scalability best practices.
Deal Breakers
- Lack of experience with SRE or cloud infrastructure
- No experience with incident management or automation
- Unwillingness to work onsite in Barcelona