✦ Luna Orbit — AI & Machine Learning

Senior ML Infrastructure Engineer, Inference Platform

at General Motors

📍 4 Locations (Hybrid) 💰 $155K – $205K USD / year · Posted March 14, 2026

Type Full-Time
Experience Mid-level
Exp. Years 5+ years
Education Not specified
Category AI & Machine Learning

This role involves developing and scaling AI inference platforms supporting autonomous vehicle AI workloads, focusing on model serving, distributed systems, and GPU optimization.

  • Build and scale the inference platform
  • Implement model serving strategies
  • Optimize GPU utilization
  • Develop monitoring and observability tooling
  • Collaborate with ML teams

The position requires expertise in Python, C++, distributed systems, and GPU optimization, along with hands-on experience in model serving frameworks such as Triton and Ray Serve, with a focus on scalable, cloud-agnostic AI infrastructure.

The ideal candidate is a senior ML infrastructure engineer with 5+ years of experience building scalable, cloud-agnostic inference platforms, proficient in Python and C++, with hands-on experience in serving frameworks such as Triton and Ray Serve, and excels at distributed systems and GPU optimization.

Requirements
  • 5+ years of experience in ML systems or backend services
  • Proficiency in Python or C++
  • Experience with ML inference and model serving frameworks
  • Knowledge of distributed systems
  • Experience with GPU utilization and optimization
  • Strong communication skills

Nice to Have
  • Experience with Triton, Ray Serve, or vLLM
  • Knowledge of cloud platforms
  • Experience with monitoring and observability tools
  • Open source contributions

Tools Triton, Ray Serve, vLLM
Skills Python, C++, ML inference, model serving frameworks (Triton, Ray Serve, vLLM), distributed systems, GPU utilization, model versioning, auto-scaling, monitoring, observability, metrics, AI infrastructure, cloud-agnostic platforms
Soft Skills Collaboration, problem-solving, communication, technical leadership, teamwork
Industry Automotive / Technology
Job Function Developing scalable AI inference platforms for autonomous vehicle applications
Role Subtype ML Infrastructure Engineer
Tech Domains Python, C++, Kubernetes, Docker, Linux, Cloud Platforms
Keywords ML inference, model serving frameworks, Triton, Ray Serve, vLLM, distributed systems, GPU utilization, model versioning, auto-scaling, monitoring, observability, Python, C++, AI infrastructure, cloud platforms, high-performance backend, machine learning systems, scalable inference, deep learning, AI deployment

Not a Fit
  • Less than 5 years of experience in ML systems
  • No experience with model serving frameworks
  • Lack of proficiency in Python or C++
  • No knowledge of distributed systems

