Position Details
About this role
This role involves designing and building scalable deployment infrastructure for AI inference models across hardware accelerators, ensuring continuous, unattended deployment with high efficiency.
Key Responsibilities
- Own deployment orchestration
- Improve deployment scheduling
- Extend deployment observability
- Drive down cycle time
- Optimize fleet rollout strategies
Technical Overview
The role focuses on AI inference deployment systems, covering resource management, automation, and monitoring across GPU, TPU, and Trainium hardware, with a strong emphasis on systems engineering and scalability.
Ideal Candidate
The ideal candidate is a software engineer with over 5 years of experience in building deployment and automation systems at scale, particularly involving inference deployment on accelerators like GPUs, TPUs, and Trainium. They should have strong systems design skills and experience managing resource-constrained environments.
Deal Breakers
- Less than 5 years of relevant experience
- Lack of experience with deployment systems at scale
- No experience with GPU, TPU, or Trainium hardware