Position Details

Type Full-Time

Experience mid

Exp. Years 3+ years

Education Bachelor's degree in computer science or equivalent

Category AI & Machine Learning

About this role

Runtime Software Development Engineer for AWS Neuron focusing on high-performance Linux drivers and HPC libraries to accelerate AI workloads on Inferentia. Builds distributed runtime software and collaborates with ML teams to optimize performance and scale.

Key Responsibilities

Design, develop, and deploy runtime software for AWS Neuron on Inferentia
Develop high-performance Linux drivers and HPC libraries (libfabric, MPI)
Collaborate with ML scientists to optimize performance
Mentor and review code; uphold coding standards and testing
Deliver scalable, fault-tolerant runtime systems for customers

Technical Overview

Expertise in Linux drivers, libfabric, MPI, and ML frameworks (TensorFlow, PyTorch, MXNet) in C/C++/Python; develops for distributed, embedded runtimes targeting AI accelerators; integrates with AWS Inferentia hardware.

Ideal Candidate

The ideal candidate is a mid-level runtime software engineer with 3+ years building distributed ML systems, strong Linux driver experience, and hands-on work with ML frameworks (TensorFlow, PyTorch, MXNet). They should excel at optimizing performance for AI accelerators and collaborate across teams to deliver scalable runtimes on AWS Inferentia.

Must-Have Skills

3+ years of non-internship professional software development experience
2+ years of non-internship design or architecture (design patterns, reliability, and scaling) of new and existing systems experience
Experience programming with at least one software programming language

Nice-to-Have Skills

3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
Bachelor's degree in computer science or equivalent

Tools & Platforms

TensorFlow, PyTorch, MXNet, MPI, libfabric, Linux

Required Skills

3+ years of non-internship software development experience
2+ years of design or architecture (design patterns, reliability, and scaling) of new and existing systems
Experience programming with at least one software programming language

Hard Skills

high-performance Linux drivers, libfabric, MPI, TensorFlow, PyTorch, MXNet, C/C++, Python, embedded software, distributed software, Linux, Inferentia, AWS

Soft Skills

communication, mentorship, teamwork, problem solving, ownership, curiosity

Industry & Role

Industry AI & Machine Learning

Job Function Develop and optimize runtime software for AI accelerators in AWS infrastructure.

Role Subtype Software Engineer

Tech Domains Amazon Web Services, Linux, TensorFlow, PyTorch, MXNet, C++, Python, MPI, libfabric

Keywords for Your Resume

Neuron Runtime SDE, Runtime Software Development Engineer, AWS Neuron, Amazon Web Services, Inferentia, Linux, high-performance Linux drivers, libfabric, MPI, TensorFlow, PyTorch, MXNet, embedded software, distributed software, C++, C, Python, Linux drivers, Neuron

Deal Breakers

No Linux driver experience
No ML framework experience
No distributed systems experience
No experience with cloud platforms
