Position Details
About this role
This role involves developing high-performance kernels for machine learning workloads on AWS's custom accelerators, working at the hardware-software boundary to optimize AI inference and training performance.
Key Responsibilities
- Optimize ML kernels for AWS's custom accelerators
- Collaborate across hardware and software teams
- Contribute to accelerator architecture design
- Implement performance improvements for AI inference and training
- Mentor engineers
Technical Overview
The technical environment includes ML frameworks like PyTorch, C++, distributed systems, and hardware accelerators such as Inferentia and Trainium, focusing on performance tuning and kernel development.
Ideal Candidate
The ideal candidate is a mid-level AI/ML engineer with experience in hardware-software optimization, proficient in C++, and familiar with deep learning frameworks like PyTorch. They should have a strong background in high-performance computing and distributed architectures, with a passion for AI acceleration technology.
Deal Breakers
- Lack of experience in low-level optimization
- No background in system architecture or ML acceleration
- No proficiency in C++
- No experience working at the hardware-software boundary