Position Details

Type Full-Time

Experience Senior

Exp. Years 6+ years (Applied research experience) and 5+ years (building machine learning models or developing algorithms)

Education PhD or Master's degree

Category AI & Machine Learning

About this role

This Senior Applied Scientist role leads the design of evaluation frameworks and data collection protocols for robotic capabilities. You will measure and stress-test robot behavior across real-world tasks, build teleoperation and operator interfaces for scalable data collection, and collaborate with engineering teams to deploy evaluation tooling into production.

Key Responsibilities

Design and implement evaluation frameworks for robot capabilities
Develop task definitions, success criteria, and benchmarking methodologies
Create and refine data collection protocols for high-quality datasets
Build and iterate on teleoperation workflows and operator interfaces
Integrate evaluation tooling with logging systems and data pipelines, and analyze results

Technical Overview

The technical scope spans robotics, machine learning, and human-in-the-loop evaluation. You will build benchmarking methodologies and evaluation policies, define task structures and success criteria, analyze failure modes and performance gaps, and integrate evaluation tools with logging systems and data pipelines within the robotics stack.

Ideal Candidate

The ideal candidate is a senior Applied Scientist with a PhD or Master’s degree and 6+ years of applied research experience in robotics and machine learning. They have hands-on expertise designing evaluation frameworks and benchmarking methodologies, building teleoperation-based data collection workflows, and leading projects through production deployment.

Must-Have Skills

Design and implement evaluation frameworks
Develop task definitions, success criteria, and benchmarking methodologies
Create and refine data collection protocols
Build and iterate on teleoperation workflows and operator interfaces
Analyze evaluation results to identify performance gaps and failure modes
Collaborate with engineering teams to integrate evaluation tooling, logging systems, and data pipelines
PhD or Master's degree and 6+ years of applied research experience
3+ years of industry or academic research experience
Experience with Java, C++, or other programming languages
5+ years of experience building machine learning models or developing algorithms for business applications
Experience leading technical initiatives and key deliverables
Experience designing evaluation methodologies, benchmarks, or experimental frameworks for large-scale ML models or robotic systems
Familiarity with teleoperation systems, simulation environments, or human-in-the-loop data collection

Required Skills

robotics, evaluation frameworks, data collection protocols, teleoperation, evaluation policies, human-in-the-loop, data pipelines, logging systems, task definitions, success criteria, benchmarking methodologies, reproducible evaluation, deep learning, model development, control, embodied AI, Java, C++, simulation environments, production deployment, mentoring

Hard Skills

evaluation frameworks, data collection protocols, robot capabilities evaluation, stress-test robot behavior, real-world tasks, robotics, machine learning, human-in-the-loop systems, teleoperation, evaluation policies, task definitions, success criteria, benchmarking methodologies, reproducible evaluation of policies, data pipelines, operator-facing interfaces, teleoperation workflows, operator interfaces, performance gap analysis, failure mode analysis, robotics stack integration, logging systems, data pipelines integration, deep learning and model development, robotics systems, control, embodied AI, simulation environments, human-in-the-loop data collection, lead technical projects from conception through production deployment, mentoring junior scientists and engineers, Java, C++

Soft Skills

highly experimental; systems-oriented; comfortable working across software, robotics, and data pipelines; turning ambiguous capability goals into measurable and actionable evaluation systems; collaboration with engineering teams; mentorship

Industry & Role

Industry Aerospace

Job Function Build measurable evaluation systems for robotics using human-in-the-loop data collection and machine learning-driven improvement

Role Subtype AI Researcher

Tech Domains Amazon Web Services

Keywords for Your Resume

Senior Applied Scientist, Applied Scientist, robotics, machine learning, human-in-the-loop, teleoperation, evaluation frameworks, evaluation policies, data collection protocols, operator-facing interfaces, task definitions, success criteria, benchmarking methodologies, reproducible evaluation of policies, data pipelines, logging systems, failure modes, performance gaps, production deployment, deep learning, model development, control, embodied AI, simulation environments, Java, C++, mentoring junior scientists, patents, publication at top-tier conferences

Deal Breakers

Must have PhD or Master's degree plus 6+ years of applied research experience
Must have Java or C++ experience
Must have 5+ years building machine learning models or developing algorithms
Must have experience designing evaluation methodologies, benchmarks, or experimental frameworks for robotic systems

Senior Applied Scientist, Fauna
