Position Details

Salary $195K USD / year

Type Not Specified

Experience Mid-level

Exp. Years Not specified

Education Not specified

Category DevOps & SRE

About this role

This role involves ensuring system reliability, automating infrastructure, and managing incident response within a financial services environment using cloud and container technologies.

Key Responsibilities

Drive reliability and performance, automate infrastructure, collaborate with cross-functional teams, lead incident management, and build monitoring systems.

Technical Overview

Stack includes cloud platforms (Azure, AWS, GCP), Kubernetes, Terraform, monitoring tools (Prometheus, Grafana), and scripting in Python, Go, or Java, with a focus on SRE principles.

Ideal Candidate

The ideal candidate is a mid-level SRE with strong expertise in cloud infrastructure (Azure, AWS, GCP), container orchestration (Kubernetes), and observability tools. They should have experience automating infrastructure, managing incidents, and mentoring engineering teams in a fast-paced financial environment.

Must-Have Skills

Cloud infrastructure (Azure, AWS, or GCP); Containerization (Docker, Kubernetes); Infrastructure as Code (Terraform, Helm); Monitoring tools (Prometheus, Grafana, Splunk, AppDynamics); Programming skills in Python, Go, or Java; SRE principles (SLAs, SLOs, error budgets)

Nice-to-Have Skills

Datadog, Incident response, On-call support, Agile methodologies

Tools & Platforms

Terraform, Kubernetes, Prometheus, Grafana, Splunk, AppDynamics, Datadog, Python, Go, Java, Azure, AWS, GCP

Required Skills

Terraform, Kubernetes, CI/CD, Prometheus, Grafana, Splunk, AppDynamics, Python, Go, Java, Cloud Infrastructure, Azure, AWS, GCP, Observability, Monitoring, Incident Response, On-Call Support, SLAs, SLOs

Hard Skills

Terraform, Kubernetes, CI/CD, Prometheus, Grafana, AppDynamics, Splunk, Python, Go, Java, Cloud Infrastructure, Azure, AWS, GCP, Observability, Monitoring, Incident Response, On-Call Support, SRE Principles

Soft Skills

Collaboration, Communication, Teamwork, Problem-solving, Technical Excellence, Mentoring

Industry & Role

Industry Financial Services / Banking

Job Function Site Reliability Engineering in a financial services setting

Keywords for Your Resume

Site Reliability Engineer, SRE, Terraform, Kubernetes, CI/CD, Prometheus, Grafana, Splunk, AppDynamics, Python, Go, Java, Cloud Infrastructure, Azure, AWS, GCP, Observability, Monitoring, Incident Response, On-Call Support, SLAs, SLOs

Deal Breakers

Lack of experience with cloud infrastructure (Azure, AWS, GCP); no experience with Kubernetes or Terraform; no familiarity with monitoring tools like Prometheus or Grafana; lack of scripting or programming skills in Python, Go, or Java

Site Reliability Engineer_Pipeline