Position Details
About this role
Senior software engineer in reliability engineering, focused on building and maintaining production services, with a strong emphasis on observability, automation, and on-call incident response.
Key Responsibilities
- Improve observability, reliability, and availability by defining and measuring key metrics
- Build automation and improve systems to eliminate toil and operations work
- Collaborate with the core infrastructure team to performance-tune and optimize cloud deployments
- Automate incident response and reduce service disruptions
- Participate in the on-call support rotation
Technical Overview
Hands-on experience with containerization, infrastructure-as-code, and cloud platforms (Docker, Kubernetes, Terraform, AWS/GCP/Azure); strong command of observability tooling (Datadog, Kibana); and readiness for incident response.
Ideal Candidate
The ideal candidate is a senior site reliability engineer with 6+ years building and operating production services, strong observability and debugging skills, and experience across AWS/GCP/Azure. They should be comfortable with on-call rotations and thrive in a fast-paced, regulated fintech environment.
Deal Breakers
Must be able to participate in on-call rotations