Position Details
About this role
We are seeking a Cloud Site Reliability Engineer with a strong software engineering and operations background to drive the reliability, scalability, and performance of cloud-based infrastructure. The role spans AWS, Azure, and GCP in a hybrid/multi-cloud setting.
Key Responsibilities
- Design and maintain fault-tolerant architectures across cloud platforms
- Implement automated failover and redundancy
- Deploy and optimize cloud resources with infrastructure as code (IaC)
- Lead real-time incident response and postmortems
- Collaborate across engineering, DevOps, and security teams
Technical Overview
- Languages: Python, PowerShell, Bash
- Cloud platforms: AWS, Azure, GCP
- Containers and orchestration: Docker, Kubernetes
- IaC: Terraform, Ansible
- Monitoring: Splunk, Azure Monitor, Dynatrace, CloudWatch
- Practices: incident response and capacity planning
Ideal Candidate
The ideal candidate is a senior cloud/SRE engineer with 10+ years of experience, hands-on expertise across AWS/Azure/GCP, and strong automation skills (Terraform, Ansible). They should excel at building reliable, scalable cloud platforms and working with DevOps, security, and engineering teams in a hybrid/multi-cloud environment.