Senior Systems Software Engineer, Kubernetes Scale - DGX Cloud

at NVIDIA

Location: Santa Clara, CA, US (Hybrid)
Posted: March 13, 2026
Type: Not specified
Experience: Senior
Exp. Years: 8+ years
Education: Bachelor's/Master's in Engineering or equivalent experience
Category: Cloud & Infrastructure

This role involves developing and optimizing distributed systems for AI workloads, focusing on performance, scalability, and cloud integration within NVIDIA's DGX Cloud platform.

  • Drive performance and scalability testing
  • Collaborate with AI researchers and developers
  • Debug and optimize Kubernetes clusters
  • Develop monitoring tools
  • Engage with open-source communities

The technical environment includes Kubernetes, open-source tools, cloud platforms such as GCP, AWS, Azure, and OCI, and programming in Golang and Python, with a focus on high-performance distributed systems.

The ideal candidate is a senior systems software engineer with at least 8 years of experience in distributed systems, open-source technologies like Kubernetes, and performance optimization. They possess deep expertise in cloud platforms and programming in Golang and Python, with a strong background in scaling AI infrastructure.

8+ years of experience; Expertise in Kubernetes; Experience with large-scale parallel and distributed systems; Performance optimization; Proficiency in Golang and Python; Experience with cloud infrastructure (GCP, AWS, Azure, OCI)
Operational experience with Kubernetes distributions; Experience with performance benchmarking; Open-source community engagement
Kubernetes, GCP, AWS, Azure, OCI, CI/CD pipelines
Kubernetes, Containers, Performance modeling, Benchmarking, Golang, Python, Cloud platforms, GCP, AWS, Azure, OCI, CNCF, Distributed systems, Scalability, Open-source
Kubernetes, K8s, Containers, Open-source technologies, Performance modeling, Benchmarking, Golang, Python, Cloud platforms, GCP, Amazon Web Services, AWS, Azure, OCI, CNCF
Collaboration, Problem-solving, Communication, Teamwork, Analytical thinking
Industry: AI & Machine Learning, Cloud Computing, High-Performance Computing
Job Function: Designing and scaling distributed AI infrastructure systems

Less than 8 years of experience; lack of expertise in Kubernetes; no experience with large-scale distributed systems; proficiency only in unrelated programming languages; location outside Santa Clara, CA without remote options
