Position Details
About this role
This role involves developing and optimizing AI/ML inference solutions on AWS hardware accelerators, focusing on large language models and deep learning frameworks.
Key Responsibilities
- Architect and implement ML inference features
- Optimize large language model performance
- Collaborate with hardware and software teams
- Support open source ecosystem integration
- Mentor junior engineers
Technical Overview
The position requires expertise in ML frameworks like PyTorch and JAX, hardware accelerators such as Inferentia and Trainium, and performance optimization for distributed inference.
Ideal Candidate
The ideal candidate is a senior AI/ML engineer with extensive experience in deep learning frameworks, hardware accelerators, and distributed inference systems, and a track record of optimizing large-scale ML models for both inference and training.
Deal Breakers
- Lack of experience with the AWS Neuron SDK
- No hardware knowledge of Inferentia or Trainium
- Insufficient experience in ML inference optimization