Position Details
About this role
This role involves designing and implementing scalable AI and ML infrastructure at NVIDIA, focusing on containerized environments, high-performance computing, and distributed systems to support AI-powered applications.
Key Responsibilities
- Design and build containers for NIM runtimes
- Develop tooling for build orchestration and CI/CD
- Optimize container performance and scalability
- Collaborate across teams for deployment
- Mentor teammates
Technical Overview
The technical environment includes Kubernetes, GPU infrastructure, Python, containerization, and open-source ML stacks, all used to build reliable, scalable AI platforms for inference and training.
Ideal Candidate
The ideal candidate is a senior AI/ML engineer with extensive experience building scalable ML infrastructure. They are proficient in Kubernetes, GPU computing, and Python, and have a strong understanding of distributed systems and AI platform development.
Must-Have Skills
Nice-to-Have Skills
Tools & Platforms
Required Skills
Hard Skills
Soft Skills
Industry & Role
Keywords for Your Resume
Deal Breakers
- Lack of experience with Kubernetes
- No background in GPU infrastructure
- Insufficient Python skills
- No experience with distributed systems
- Lack of AI/ML platform development experience