Position Details

Type Not Specified

Experience mid

Exp. Years Not specified

Education Not specified

Category AI & Machine Learning

About this role

This role involves developing high-performance kernels for machine learning workloads on AWS's custom accelerators, working at the hardware-software boundary to optimize AI inference and training performance.

Key Responsibilities

Optimize ML kernels
Collaborate across hardware and software teams
Contribute to architecture design
Implement performance improvements
Mentor engineers

Technical Overview

The technical environment includes C++, ML frameworks such as PyTorch, distributed systems, and hardware accelerators such as Inferentia and Trainium, with a focus on performance tuning and kernel development.

Ideal Candidate

The ideal candidate is a mid-level AI/ML engineer who has experience in hardware-software optimization, is proficient in C++, and is familiar with deep learning frameworks such as PyTorch. They should have a strong background in high-performance computing and distributed architectures, along with a passion for AI acceleration technology.

Must-Have Skills

low-level optimization, system architecture, ML model acceleration, C++, software engineering

Nice-to-Have Skills

ML frameworks, PyTorch, distributed systems, hardware knowledge, research publication

Tools & Platforms

AWS Neuron SDK, PyTorch, GitHub, AWS

Required Skills

Machine Learning, ML compiler, runtime, PyTorch, high-performance computing, distributed architectures, kernel optimization, hardware-software boundary, ML acceleration, C++, software development, performance optimization

Hard Skills

Machine Learning, ML compiler, runtime, PyTorch, high-performance computing, distributed architectures, kernel optimization, hardware-software boundary, ML acceleration, C++, software development, performance optimization

Soft Skills

collaboration, problem-solving, innovative thinking, mentoring, communication

Industry & Role

Industry Technology / Cloud Computing / Artificial Intelligence

Job Function Developing and optimizing machine learning kernels for AI acceleration

Role Subtype Machine Learning Engineer

Tech Domains Amazon Web Services, PyTorch, Machine Learning, High-performance computing

Keywords for Your Resume

Machine Learning, ML compiler, runtime, PyTorch, high-performance computing, distributed architectures, kernel optimization, hardware-software boundary, ML acceleration, C++, software development, performance optimization, AWS Neuron SDK, AI acceleration, deep learning workloads, GPU, TensorFlow, AI hardware, performance tuning, software engineering, machine learning, distributed systems, deep learning

Deal Breakers

Lack of experience in low-level optimization, no background in system architecture or ML acceleration, no proficiency in C++, no experience working at the hardware-software boundary

ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs
