Position Details
About this role
Lead a team responsible for enterprise observability platforms and core development tooling, improving how production issues are detected, diagnosed, and resolved at scale.
Key Responsibilities
- Lead & develop the team: hire, coach, set priorities, and build a culture of reliability, ownership, learning, and continuous improvement
- Set observability direction: define vision/roadmap and standards for metrics, logs, traces, alerting, dashboards, and service health; promote early instrumentation and SLI/SLO-based practices
- Own DevOps Tools & CI/CD platforms: governance for Azure DevOps, GitHub Enterprise, Jenkins; define guardrails for repos, branching, pipelines, build agents
- Drive reliability & incident excellence: partner on incident response, RCA, post-mortems; improve detection, triage, rollback, recovery, on-call readiness
- Platform engineering & automation: manage platforms via Infrastructure as Code; standardize configurations and operational practices
Technical Overview
Cloud-native architecture focus with IaC, CI/CD tooling, and a suite of observability platforms; leadership of a DevOps/SRE/Platform Tools team.
Ideal Candidate
The ideal candidate is an experienced observability/DevOps leader with hands-on cloud-native and IaC expertise, able to drive reliability initiatives and manage enterprise tooling.
Deal Breakers
Must have leadership experience in Observability/SRE/DevOps and a strong hands-on background with the listed observability tools.