Senior Staff Site Reliability Engineer

at Rocket.Chat

📍 Remote, US
Posted March 21, 2026
Type Full-Time
Experience Senior
Exp. Years 7+ years
Education Not specified
Category DevOps & SRE

This role involves leading the infrastructure strategy and operational excellence for Rocket.Chat's deployment systems, focusing on cloud infrastructure, Kubernetes, automation, and monitoring across global deployments.

  • Lead infrastructure strategy
  • Guide platform roadmap
  • Ensure system reliability
  • Oversee deployment automation
  • Collaborate with engineering teams

The position requires expertise in cloud platforms (AWS, GCP, Azure), Kubernetes, Infrastructure as Code, deployment automation, and observability tools, supporting scalable, reliable SaaS infrastructure.

The ideal candidate is a senior-level Site Reliability Engineer with extensive experience in cloud infrastructure, Kubernetes, and automation tools. They possess strong leadership skills and are capable of guiding large-scale distributed system operations across multiple cloud platforms.

  • Strong background in software engineering
  • Experience designing and operating large-scale distributed systems
  • Expertise with Kubernetes and cloud platforms
  • Experience with Infrastructure as Code tools
  • CI/CD and GitOps deployment systems
  • Monitoring, logging, and alerting systems
  • Networking fundamentals
  • Leadership in infrastructure or platform teams

  • Experience with SaaS platforms
  • Multi-cluster Kubernetes management
  • Multi-region cloud architectures
  • Deep expertise in cloud security

Kubernetes, AWS, Google Cloud Platform, Azure, Terraform, Pulumi, Ansible, ArgoCD, Prometheus, Grafana, Loki, MongoDB, Redis
Go, Python, Kubernetes, AWS, GCP, Azure, Terraform, Pulumi, Ansible, ArgoCD, Prometheus, Grafana, Loki, Networking fundamentals, Cloud architecture, Microservices, Deployment automation, CI/CD, GitOps, Containerized systems, MongoDB, Redis, Multi-cluster Kubernetes, Multi-region architectures
Leadership, Strategic thinking, Collaboration, Problem-solving, Communication, Automation mindset
Industry Technology / SaaS
Job Function Leading cloud infrastructure and reliability engineering for SaaS platform
Role Subtype DevOps & SRE
Tech Domains Kubernetes, Amazon Web Services, Google Cloud Platform, Microsoft Azure, Terraform, Pulumi, Ansible, Prometheus, Grafana, Loki, MongoDB, Redis
kubernetes, aws, google cloud platform, azure, terraform, pulumi, ansible, argo cd, prometheus, grafana, loki, microservices, deployment automation, ci/cd, gitops, cloud infrastructure, large-scale distributed systems, infrastructure as code, multi-cluster kubernetes, multi-region architectures, gcp

Fewer than 7 years of experience, lack of Kubernetes or cloud platform expertise, no experience with automation tools, inability to work remotely
