Position Details

Type Full-Time

Experience Senior

Exp. Years 4+ years

Education Not specified

Category AI & Machine Learning

About this role

Scale AI seeks an ML Systems Engineer to build scalable backend platforms for serving foundation models in robotics and Physical AI, bridging research and production engineering.

Key Responsibilities

Build and scale fault-tolerant, high-performance systems for serving robotics models
Develop platforms that enable model capability discovery for faster research iteration
Collaborate with robotics researchers and computer vision engineers to optimize models for production and research
Conduct architecture and design reviews for scalability, reliability, and security
Develop observability and monitoring for real-time performance of model inference

Technical Overview

You will design fault-tolerant, high-performance ML serving platforms, develop internal platforms for model capability discovery, and collaborate with robotics researchers and CV engineers. The tech stack includes Python/Go/Rust/C++, CUDA, Docker, Kubernetes, Terraform, and cloud providers (AWS/GCP) with GPU-accelerated inference and data pipelines.

Must-Have Skills

4+ years of experience building large-scale, high-performance backend systems, CUDA, kernel tuning, Python, Go, Rust, C++, Kubernetes, Docker, Amazon Web Services (AWS), Google Cloud Platform, Terraform, Infrastructure as Code, ROS, ROS2

Nice-to-Have Skills

Vision-Language-Action (VLA) models, FFmpeg, NVDEC/NVENC, 3D data handling (point clouds), ROS/ROS2 familiarity, AV data formats

Tools & Platforms

Docker, Kubernetes, Terraform, Amazon Web Services, Google Cloud Platform, FFmpeg, NVDEC, NVENC, ROS, ROS2

Required Skills

Python, Go, Rust, C++, CUDA, kernel tuning, Docker, Kubernetes, Amazon Web Services (AWS), Google Cloud Platform, Terraform, Infrastructure as Code, FFmpeg, NVDEC/NVENC, ROS, ROS2, GPU, observability, model serving

Hard Skills

Python, Go, Rust, C++, CUDA, kernel tuning, Docker, Kubernetes, Amazon Web Services, Google Cloud Platform, Terraform, Infrastructure as Code, FFmpeg, NVDEC, NVENC, ROS, ROS2

Soft Skills

Collaboration, Communication, Problem-solving, Independence, Leadership, Ownership

Industry & Role

Industry SaaS

Job Function Develop and operate scalable ML serving platforms for robotics foundation models.

Role Subtype ML Systems Engineer

Keywords for Your Resume

ML Systems Engineer, Robotics, Physical AI, foundation models, machine learning infrastructure, CUDA, kernel tuning, Python, Go, Rust, C++, Docker, Kubernetes, Terraform, Infrastructure as Code, Amazon Web Services, Google Cloud Platform, FFmpeg, NVDEC, NVENC, ROS, ROS2, observability, model serving

Deal Breakers

Fewer than 4 years of backend systems experience; no CUDA or GPU optimization experience; no Kubernetes or Docker experience; no Python/Go/C++ proficiency

ML Systems Engineer, Robotics