Position Details
About this role
This role involves developing and optimizing AI/ML models on AWS custom accelerators, focusing on distributed training support, performance tuning, and hardware-software integration.
Key Responsibilities
- Architect and implement distributed training support
- Optimize ML models for performance
- Mentor team members
- Collaborate across hardware and software teams
- Support large language models
Technical Overview
The work centers on deep learning frameworks, distributed training, ML accelerators such as AWS Trainium, kernel and runtime optimization, and model tuning in a collaborative environment.
Ideal Candidate
The ideal candidate is a mid-level AI/ML engineer with experience in deep learning frameworks like PyTorch and JAX, skilled in distributed training and model optimization, and familiar with ML accelerators such as AWS Trainium.
Deal Breakers
- Lack of experience with ML frameworks
- No knowledge of distributed training
- No experience with model tuning
- On-site only without remote options
- Lack of collaboration skills