Position Details
About this role
This role involves partnering with AI software teams and customers to enable large-scale training and inference on AMD GPUs, designing scalable Kubernetes architectures, and optimizing AI workloads.
Key Responsibilities
- Partner with AI software teams and customers to enable large-scale LLM training
- Design scalable Kubernetes architectures for AI deployment
- Validate inference frameworks such as vLLM and SGLang
- Optimize AI workloads on AMD GPUs
- Collaborate with customers on deployment and performance
Technical Overview
The technical environment includes AI infrastructure, Kubernetes, distributed training frameworks, GPU computing, and inference frameworks like vLLM and SGLang.
Ideal Candidate
The ideal candidate is a lead AI platform engineer with extensive experience in large-scale AI infrastructure, Kubernetes, and GPU-based distributed training. They should be solution-oriented, collaborative, and capable of designing scalable AI deployment architectures.
Deal Breakers
- Lack of experience with Kubernetes or AI infrastructure
- No experience with large language models
- Unwillingness to work in a hybrid environment