Position Details
About this role
This role supports the operational stability and reliability of customer-facing digital platforms through incident management, monitoring, and cloud integration.
Key Responsibilities
- Lead incident management and escalation
- Ensure platform stability and resilience
- Collaborate with stakeholders
- Implement monitoring and observability
- Support deployment and upgrades
Technical Overview
The position requires expertise in software engineering and site reliability engineering, cloud platforms such as AWS, Azure, and GCP, and tools such as Kubernetes and Docker, with a focus on incident response and operational resilience.
Ideal Candidate
The ideal candidate is a mid-level software engineer with 4+ years of experience in site reliability engineering, incident management, and cloud platforms. They possess strong collaboration and problem-solving skills, with a focus on operational stability and platform resilience.
Deal Breakers
- Less than 4 years of experience
- Lack of incident management experience
- No cloud platform experience
- No familiarity with ITIL