
Senior/Staff Software Engineer - Machine Learning Infrastructure, Slack

at Salesforce

📍 4 Locations (Unknown) | 💰 $172K – $313K USD / year | Posted March 18, 2026
Salary: $172K – $313K USD / year
Type: Not specified
Experience: Senior
Exp. Years: Not specified
Education: Not specified
Category: AI & Machine Learning

This role involves designing and implementing scalable AI and ML infrastructure at NVIDIA, focusing on containerized environments, high-performance computing, and distributed systems to support AI-powered applications.

  • Design and build containers for NIM runtimes
  • Develop tooling for build orchestration and CI/CD
  • Optimize container performance and scalability
  • Collaborate across teams for deployment
  • Mentor teammates

The technical environment includes Kubernetes, GPU infrastructure, Python, containerization, and open-source ML stacks, aimed at building reliable, scalable AI platforms for inference and training.

The ideal candidate is a senior AI/ML engineer with extensive experience in building scalable ML infrastructure, proficient in Kubernetes, GPU computing, and Python, with a strong understanding of distributed systems and AI platform development.

Kubernetes, GPU infrastructure, Python, Distributed systems, ML model deployment, Model training, Inference, High performance platforms
vLLM, KubeRay, Open source ML stacks, Scalable ML systems, AI platform development
Kubernetes, vLLM, KubeRay, Open source ML stacks, GPU infrastructure
Machine Learning, ML Infrastructure, Kubernetes, GPU, Python, Distributed Systems, Model Training, Model Deployment, Inference, Monitoring
Machine Learning, ML Infrastructure, Kubernetes, GPU, vLLM, KubeRay, Python, Open source ML stacks, Model training, Model deployment, Inference, Monitoring, Distributed systems, High performance platforms
Collaboration, Communication, Problem-solving, Architectural decision-making, Teamwork
Industry: SaaS
Job Function: Developing scalable container and cloud infrastructure for AI/ML applications
Role Subtype: AI & Machine Learning
Tech Domains: Kubernetes, Python, GPU, Active Directory, Linux
machine learning, ML infrastructure, Kubernetes, GPU, model training, model deployment, inference, monitoring, distributed systems, high performance platforms, Python, AI platform, Slack AI, ML lifecycle, scalable systems, large scale systems, AI-powered applications, ML pipelines, model serving, model inference, GPU infrastructure, ML deployment

Lack of experience with Kubernetes, no background in GPU infrastructure, insufficient Python skills, no experience with distributed systems, lack of AI/ML platform development experience.
