Position Details
About this role
This role involves designing and implementing container and cloud infrastructure for NVIDIA's AI inference microservices, focusing on performance, scalability, and reliability in GPU-accelerated environments.
Key Responsibilities
- Design and build containers for NIM runtimes
- Develop Python tooling for orchestration and CI/CD
- Optimize container performance and GPU utilization
- Evolve container image strategies and registry topology
- Collaborate across teams to ensure model availability
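The Python tooling and registry work mentioned above typically involves handling OCI image references. As an illustrative sketch only (the function name and defaults are hypothetical, not taken from this posting), a helper like this splits an image reference into registry, repository, and tag so CI/CD scripts can route pushes correctly:

```python
def parse_image_ref(ref: str) -> dict:
    """Split an OCI image reference into registry, repository, and tag.

    Hypothetical helper for illustration; follows the common convention
    that the first path component is a registry host when it contains a
    dot, a port colon, or is "localhost".
    """
    name, tag = ref, "latest"  # default tag when none is given
    slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > slash:  # a ':' after the last '/' separates the tag
        name, tag = ref[:colon], ref[colon + 1:]
    parts = name.split("/", 1)
    if len(parts) == 2 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, repository = parts
    else:
        registry, repository = "docker.io", name  # bare names imply Docker Hub
    return {"registry": registry, "repository": repository, "tag": tag}
```

For example, `parse_image_ref("nvcr.io/nim/llama:1.0")` yields `nvcr.io` as the registry, `nim/llama` as the repository, and `1.0` as the tag.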
Technical Overview
The technical environment includes Kubernetes, Docker, containerd, OCI standards, Python tooling, Helm charts, NVIDIA GPU tools, and cloud services, with a focus on container orchestration, GPU workload optimization, and AI inference deployment.
Ideal Candidate
The ideal candidate is a highly experienced software engineer with more than 10 years of experience in containerization, Kubernetes, and GPU workloads, particularly in deploying AI inference microservices. They possess strong Python skills and deep knowledge of container build and deployment strategies for high-performance environments.
Deal Breakers
- Less than 10 years of experience in software engineering
- Lack of Kubernetes or container experience
- No experience with GPU workloads or NVIDIA tools
- No proficiency in Python
- No experience with container image layering or registry workflows