Position Details
About this role
This role involves developing and optimizing AI/ML inference solutions on AWS hardware accelerators, focusing on large language models and deep learning frameworks.
Key Responsibilities
- Architect and implement ML inference features
- Optimize large language model performance
- Collaborate with hardware and software teams
- Support open source ecosystem integration
- Mentor junior engineers
Technical Overview
The position requires expertise in ML frameworks like PyTorch and JAX, hardware accelerators such as Inferentia and Trainium, and performance optimization for distributed inference.
Ideal Candidate
The ideal candidate is a senior AI/ML engineer with extensive experience in deep learning frameworks, hardware accelerators, and distributed inference systems, and a track record of optimizing large-scale ML models for both inference and training.
Deal Breakers
- Lack of experience with the AWS Neuron SDK
- No hardware knowledge of Inferentia or Trainium
- Insufficient experience in ML inference optimization