Position Details
About this role
This role involves deploying and maintaining AI/ML infrastructure, focusing on GPU clusters, containerized workflows, and automation to support advanced AI applications.
Key Responsibilities
- Deploy GPU clusters
- Manage AI workflows
- Implement automation
- Troubleshoot infrastructure issues
- Support scalable AI systems
Technical Overview
The role requires hands-on experience with GPU clusters, Kubernetes, Python, Linux systems, and ML frameworks such as PyTorch and TensorFlow. The work centers on automation, troubleshooting, and infrastructure scaling.
Ideal Candidate
The ideal candidate is a senior AI/ML infrastructure engineer with more than 5 years of experience deploying GPU clusters, managing containerized AI workflows, and supporting scalable AI platforms, along with strong Kubernetes and Python skills.
Deal Breakers
- Fewer than 5 years of experience
- No Kubernetes experience
- No knowledge of Python or ML frameworks
- No experience with GPU clusters