About this role
Senior Reliability Engineer responsible for the reliability and resilience of JetBlue's critical infrastructure. The role leads incident response, defines SLOs/SLIs, and drives toil reduction through automation.
Key Responsibilities
- Own reliability outcomes for infrastructure
- Define SLIs/SLOs
- Lead incident response as Incident Commander
- Participate in 24x7 on-call
- Improve monitoring and automation
Technical Overview
The stack includes Linux, Kubernetes, and Azure. The role emphasizes observability, IaC, and automation to improve reliability and incident response.
Ideal Candidate
The ideal candidate is a senior site reliability engineer with 5+ years of experience in infrastructure, cloud, and SRE practices. They should have hands-on expertise with Kubernetes on Azure, strong Linux troubleshooting skills, and a proven ability to lead incident response and drive reliability improvements through automation and IaC.
Must-Have Skills
- Bachelor's Degree in Computer Science or related discipline, OR demonstrated capability with High School Diploma/GED + 4+ years relevant experience
- 5+ years of experience in Site Reliability Engineering, infrastructure operations, DevOps, or production engineering
- Strong Linux troubleshooting
- Kubernetes experience
- Azure cloud experience
- Observability tools experience
- Programming/scripting: Python, Go, or Java
- Infrastructure as Code
- On-call rotation experience
- US work authorization; not eligible for visa sponsorship
Nice-to-Have Skills
- 7+ years of experience in Site Reliability Engineering, infrastructure operations, DevOps, or production engineering
- Defining and operationalizing SLOs and using error budgets
- Capacity planning and demand forecasting
- Mentoring engineers or acting as a technical lead
- Experience with additional cloud platforms or hybrid environments
Tools & Platforms
Azure, Kubernetes, Linux
Required Skills
Bachelor's degree; 5+ years SRE; Linux; Kubernetes; Azure; observability; IaC; automation; Python/Go/Java; on-call; incident response; SLOs/SLIs; Incident Commander
Hard Skills
Linux, Kubernetes, Azure, Observability, metrics, logs, tracing, alerting, Infrastructure as Code, Python, Go, Java, Automation, infrastructure-as-code, on-call, Incident Commander, SLOs, SLIs, post-incident reviews, on-call rotations
Soft Skills
Strong communication, Leadership, Mentoring, Collaboration, Decision-making under pressure, Problem-solving
Keywords for Your Resume
Senior Reliability Engineer, Site Reliability Engineer, Infrastructure, Azure, Kubernetes, Linux, TCP/IP, DNS, load balancing, SLIs, SLOs, Incident Commander, on-call, Observability, automation, infrastructure-as-code, Python, Go, Java, post-incident reviews, on-call rotations, Infrastructure as Code
Deal Breakers
US work authorization required; not eligible for visa sponsorship; must pass pre-employment drug test