Senior ML Engineer, Fauna

at Amazon.com

📍 US, NY, New York Unknown Posted April 14, 2026
Type Full-Time
Experience senior
Exp. Years 5+ years
Education Bachelor's degree or above in computer science, machine learning, engineering, or related fields, or Master's degree
Category AI & Machine Learning

Senior ML Engineer role focused on building and scaling the machine learning systems behind intelligent robots. You will design training, evaluation, experiment management, and deployment infrastructure with low-latency serving on robotic and edge hardware.

  • Design and build scalable ML training infrastructure (distributed training pipelines, GPU cluster management)
  • Develop systems for experiment tracking, model versioning, and reproducibility
  • Build deployment infrastructure for serving ML models on robotic hardware with strict latency requirements
  • Optimize model inference for edge devices and embedded systems
  • Collaborate with research teams to accelerate experimentation to production

Work spans distributed GPU training infrastructure, experiment tracking and reproducible model versioning, and robust deployment/serving pipelines for robotic hardware. The role emphasizes optimizing inference for edge and embedded systems and using Kubernetes/Docker container ecosystems.

The ideal candidate is a Senior ML Engineer with 5+ years of professional software development and extensive experience designing and scaling ML training and deployment infrastructure. They have strong Large Language Model fundamentals, hands-on experience with distributed training on GPU clusters, and can optimize inference for edge and embedded systems in robotics contexts.

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language experience
  • 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution, or experience in development in the last 3 years
  • Experience with machine learning (ML) tools and methods
  • Experience in Kubernetes, Docker or containers ecosystem, or experience that includes strong analytical skills, attention to detail, and effective communication abilities and experience with programming/scripting (Batch, VB, PowerShell, Java, C#, Chef, Perl, Ruby and/or PHP)

  • Experience building and operating a cloud-based architecture
  • Experience with robotics data (sensor streams, video, point clouds) and real-time inference systems
  • Familiarity with model optimization techniques (quantization, pruning, distillation)
  • Experience with reinforcement learning or simulation-based training pipelines
Kubernetes, Docker, containers ecosystem, GPU cluster management, cloud-based architecture, robotic hardware
machine learning, Large Language Model fundamentals, ML training, distributed training pipelines, GPU cluster management, experiment tracking, model versioning, reproducibility, deployment infrastructure, serving ML models, robotics, edge devices, embedded systems, Kubernetes, Docker, containers ecosystem, quantization, pruning, distillation, reinforcement learning, Batch, VB, PowerShell, Java, C#, Chef, Perl, Ruby, PHP
machine learning, Large Language Model fundamentals, ML training, ML inference, ML model architecture, training/inference lifecycles, optimization of model execution, experiment tracking, model versioning, reproducibility, distributed training pipelines, GPU cluster management, deployment infrastructure, ML model serving, robotic hardware, edge devices, embedded systems, quantization, pruning, distillation, Kubernetes, Docker, containers ecosystem, data pipelines, data labeling infrastructure, robot locomotion, robot perception, robot manipulation, robot navigation, human-robot interaction, programming/scripting, Batch, VB, PowerShell, Java, C#, Chef, Perl, Ruby, PHP
mentoring, tech lead experience, collaboration with research teams, cross-functional collaboration, effective communication abilities, attention to detail, leading design or architecture, reliability and scaling mindset
Industry E-commerce
Job Function Build and scale ML training and deployment infrastructure for robotics.
Role Subtype ML Engineer
Tech Domains Amazon Web Services, Kubernetes, Docker, AI & Machine Learning
Senior ML Engineer, machine learning systems, ML training infrastructure, distributed training pipelines, GPU cluster management, experiment tracking, model versioning, reproducibility, deployment infrastructure, serving ML models, robot locomotion, robot perception, robot manipulation, robot navigation, human-robot interaction, edge devices, embedded systems, Kubernetes, Docker, containers ecosystem, Large Language Model fundamentals, training/inference lifecycles, optimization of model execution, quantization, pruning, distillation, reinforcement learning

  • Must have 5+ years of programming with at least one software programming language
  • Must have 5+ years leading design or architecture (reliability and scaling) of new and existing systems
  • Must have Large Language Model fundamentals experience (architecture, training/inference lifecycles, and optimization of model execution)
