Position Details

Type Not Specified

Experience Senior

Exp. Years 7+ years

Education Bachelor's degree in relevant field(s) or equivalent

Category DevOps & SRE

About this role

This role involves designing and maintaining scalable, reliable infrastructure systems, leading incident response efforts, and implementing automation to support mission-critical production services.

Key Responsibilities

Design scalable infrastructure
Implement automated deployment pipelines
Lead incident response
Develop and monitor SLOs and error budgets
Improve system reliability

Technical Overview

The role focuses on infrastructure automation, deployment pipelines, observability, disaster recovery, and distributed systems at scale, with a strong emphasis on system reliability and incident management.

Ideal Candidate

The ideal candidate is a senior SRE with at least 7 years of experience in designing scalable infrastructure, automating deployment pipelines, and managing system reliability. They are skilled in incident response and in implementing observability solutions.

Must-Have Skills

7+ years of relevant experience
Infrastructure design
Automation
Incident response
System reliability

Nice-to-Have Skills

SLOs and error budgets
Distributed systems
DevOps practices
Monitoring tools

Tools & Platforms

Automation tools
Monitoring platforms

Required Skills

Infrastructure systems
Automated deployment pipelines
Observability platforms
Disaster recovery
Service level objectives
Error budgets
Automation tools
Distributed systems
Incident response
System reliability

Hard Skills

Infrastructure systems
Automated deployment pipelines
Observability platforms
Disaster recovery
Service level objectives
Error budgets
Automation tools
Distributed systems
Incident response
System reliability

Soft Skills

Problem-solving
Collaboration
Analytical thinking
Communication
Leadership

Industry & Role

Industry Financial Services

Job Function Maintain and improve scalable, reliable infrastructure systems supporting critical production services

Keywords for Your Resume

Site Reliability Engineer
SRE
Infrastructure systems
Automated deployment pipelines
Observability platforms
Disaster recovery
Service level objectives
Error budgets
Distributed systems
Incident response
System reliability
DevOps
Automation
Monitoring
Troubleshooting
Deployment pipelines

Deal Breakers

Less than 7 years of relevant experience
Lack of experience with infrastructure automation
No background in incident response or system reliability
Unwillingness to work in a hybrid environment

Principal Site Reliability Engineer (SRE)