Position Details
About this role
This role maintains and improves the reliability and scalability of internal services and platforms at Braze, using automation and modern infrastructure tools.
Key Responsibilities
- Maintain system uptime
- Automate infrastructure deployment
- Troubleshoot scalability issues
- Develop monitoring and alerting
- Collaborate with engineering teams
Technical Overview
The candidate will work with infrastructure as code, container orchestration, distributed systems, and monitoring tools to ensure high availability and performance.
Ideal Candidate
The ideal candidate is a mid-level Site Reliability Engineer with experience in infrastructure automation, container orchestration, and scalable system design. They are proficient with tools like Kubernetes, Terraform, and Docker, and have a strong background in maintaining high-availability systems.
Must-Have Skills
Nice-to-Have Skills
Tools & Platforms
Required Skills
Hard Skills
Soft Skills
Industry & Role
Keywords for Your Resume
Deal Breakers
- Lack of experience with infrastructure as code
- No experience with Kubernetes or Docker
- Inability to troubleshoot distributed systems
- No background in system reliability engineering
- Unwillingness to work onsite in San Francisco