Position Details
About this role
The Senior Principal Site Reliability Engineer designs, implements, and drives observability, automation, and reliability initiatives across large distributed systems in a financial services environment.
Key Responsibilities
- Lead reliability-focused initiatives
- Define SLIs, SLOs, and error budgets
- Build automated remediation from observability signals
- Reduce MTTR, MTTD, and toil
- Mentor engineering teams
Technical Overview
The role centers on SRE principles, infrastructure as code (IaC), CI/CD, containerization with Kubernetes and Docker, monitoring and observability, and performance under load, with an emphasis on cross-functional collaboration.
Ideal Candidate
The ideal candidate is a senior SRE/DevOps leader with 15+ years of systems engineering experience, expertise in Python/Go/Java/Ruby, and hands-on experience with Kubernetes/Docker, CI/CD, and observability practices for improving reliability at scale.
Deal Breakers
- 15+ years of experience in relevant domains
- 7+ years in technical leadership
- Proven ability to design and operate in hybrid on-premises and cloud environments