Position Details

Salary $150K – $175K USD / year

Type Full-Time

Experience Senior

Exp. Years 10+ years

Education Not specified

Category DevOps & SRE

About this role

This role involves leading the development and implementation of site reliability engineering practices, focusing on monitoring, performance, and automation across cloud platforms like Azure, AWS, and GCP, to ensure system stability and scalability.

Key Responsibilities

Build SRE practices and patterns
Monitor and improve system performance
Lead automation and infrastructure as code initiatives
Mentor SRE teams
Collaborate with architecture and engineering teams

Technical Overview

The technical environment includes Java, Python, Go, Perl, Ruby, shell scripting, Kubernetes, cloud platforms (Azure, AWS, GCP), CI/CD pipelines, and observability tools, emphasizing scalable, reliable infrastructure.

Ideal Candidate

The ideal candidate is a senior SRE with over 10 years of experience in enterprise software development, proficient in multiple programming languages and cloud platforms. They should have strong leadership skills, experience in monitoring and performance engineering, and a passion for building reliable, scalable systems.

Must-Have Skills

10+ years of experience in enterprise software development
Proficiency in multiple languages (Java, Python, Go, Perl, Ruby, shell scripting)
5+ years in implementing SRE practices
Experience with monitoring and performance engineering
Experience with cloud platforms (Azure, AWS, GCP)
Experience with Kubernetes
CI/CD pipelines
Automation and Infrastructure as Code

Nice-to-Have Skills

Experience in creating reusable patterns
Experience in attracting and mentoring SRE teams
Knowledge of observability tools

Tools & Platforms

Azure, AWS, Google Cloud Platform, Kubernetes, CI/CD tools, Monitoring tools

Required Skills

Java, Python, Go, Perl, Ruby, Shell scripting, Site Reliability Engineering, Monitoring, Performance Engineering, Cloud Computing, Azure, AWS, GCP, Kubernetes, CI/CD, Automation, Infrastructure as Code

Hard Skills

Java, Python, Go, Perl, Ruby, Shell scripting, Site Reliability Engineering, Monitoring, Performance Engineering, Cloud Computing, Azure, AWS, GCP, Kubernetes, CI/CD, Automation, Infrastructure as Code

Soft Skills

Leadership, Mentorship, Collaboration, Communication, Problem-solving, Continuous learning, Empathy, Humility

Industry & Role

Industry Healthcare & Medical

Job Function Building and maintaining reliable, scalable cloud infrastructure with a focus on monitoring and automation

Role Subtype Site Reliability Engineer

Tech Domains Azure, Amazon Web Services, Google Cloud Platform, Kubernetes

Keywords for Your Resume

site reliability engineer, SRE, cloud engineering, Azure, AWS, Google Cloud Platform, Kubernetes, monitoring, performance engineering, automation, infrastructure as code, CI/CD, enterprise software, leadership, mentorship, observability, reusable patterns, cloud platforms, GCP

Deal Breakers

Less than 10 years of experience, Lack of experience with cloud platforms (Azure, AWS, GCP), No experience with Kubernetes, No background in enterprise software development

Site Reliability Engineer- Principal - Epic
