Position Details

Type Full-Time

Experience mid

Exp. Years Not specified

Education Not specified

Category DevOps & SRE

About this role

This role involves maintaining and improving the reliability and scalability of internal services and platforms at Braze, leveraging automation and modern infrastructure tools.

Key Responsibilities

Maintain system uptime
Automate infrastructure deployment
Troubleshoot scalability issues
Develop monitoring and alerting
Collaborate with engineering teams

Technical Overview

The candidate will work with infrastructure as code, container orchestration, distributed systems, and monitoring tools to ensure high availability and performance.

Ideal Candidate

The ideal candidate is a mid-level Site Reliability Engineer with experience in infrastructure automation, container orchestration, and scalable system design. They are proficient with tools like Kubernetes, Terraform, and Docker, and have a strong background in maintaining high-availability systems.

Must-Have Skills

Experience with infrastructure as code
Knowledge of Kubernetes and Docker
System reliability experience
Ability to troubleshoot scalability issues
Experience with monitoring and alerting

Nice-to-Have Skills

Experience with Ruby on Rails
MongoDB
Redis
Kafka
Cloud infrastructure

Tools & Platforms

Chef, Terraform, Kubernetes, Docker, Ruby on Rails, MongoDB, Redis, Kafka

Required Skills

Site Reliability Engineer, SRE, infrastructure as code, Chef, Terraform, Kubernetes, Docker, Linux, distributed systems, scaling algorithms, monitoring, alerting, automation, Ruby on Rails, MongoDB, Redis, Kafka

Hard Skills

Site Reliability Engineering, SRE, infrastructure as code, Chef, Terraform, Kubernetes, Docker, Linux, system administration, distributed systems, scaling algorithms, monitoring, alerting, automation, Ruby on Rails, MongoDB, Redis, Kafka

Soft Skills

collaboration, problem-solving, automation, communication, teamwork, adaptability

Industry & Role

Industry SaaS

Job Function Ensure infrastructure reliability and scalability through automation and system engineering

Keywords for Your Resume

Site Reliability Engineer, SRE, infrastructure as code, Chef, Terraform, Kubernetes, Docker, Linux, distributed systems, scaling algorithms, monitoring, alerting, automation, Ruby on Rails, MongoDB, Redis, Kafka

Deal Breakers

Lack of experience with infrastructure as code
No experience with Kubernetes or Docker
Inability to troubleshoot distributed systems
No background in system reliability engineering
Unwillingness to work onsite in San Francisco

Senior Site Reliability Engineer
