Data Platform Engineer, Fauna

at Amazon.com

📍 US, NY, New York · Posted April 14, 2026
Type Full-Time
Experience mid
Exp. Years 3+ years
Education Bachelor's degree or above in computer science, computer engineering, or related field, or experience in data science, machine learning or data mining
Category Data & Analytics

Build the foundational data platform that powers robotics and machine learning development. You will design scalable ingestion, storage, processing, transformation, and real-time monitoring systems for robot telemetry, video, logs, and performance metrics.

  • Design and build scalable data pipelines for ingesting and processing robotics data (sensor streams, video, telemetry, logs)
  • Develop and maintain data storage solutions optimized for diverse data types and access patterns
  • Create tools and APIs for researchers and engineers to efficiently query and analyze large datasets
  • Build real-time data processing systems for monitoring robot fleet performance
  • Build and maintain data transformation pipelines that prepare robotics data for ML training

Own end-to-end robotics data pipeline architecture: ingest (sensor telemetry/video/logs), store across diverse access patterns, transform data for ML training, and implement real-time processing for fleet performance monitoring. Work with Kafka and/or Hadoop ecosystem components (Hive, Spark, HBase, YARN) and cloud infrastructure to deliver APIs and tooling for querying and analysis at scale.

The ideal candidate is a data engineer with 3+ years of experience building scalable pipelines that ingest and transform high-volume robotics data. They have advanced SQL skills, use Python to script automation, and have worked with Kafka and distributed systems on cloud infrastructure. They can partner with ML and robotics teams to deliver real-time processing and well-structured datasets for machine learning training.

data engineering, scalable data pipelines, advanced SQL, Python, Kafka, cloud computing technologies, distributed systems
Hive, Apache Spark, HBase, YARN
Kafka, Hive, Apache Spark, HBase, YARN, Python, SQL
data engineering, scalable data pipelines, data ingestion, data processing, data storage solutions, data transformation pipelines, real-time data processing, advanced SQL, Python, automation scripting, Kafka, Hive, Spark, HBase, YARN, cloud computing technologies, distributed systems, APIs, data querying, data analysis, machine learning training data
data engineering, scalable data pipelines, data ingestion, data processing, data storage solutions, data transformation pipelines, real-time data processing, robotics data, sensor telemetry, video streams, operational logs, performance metrics, data modeling, advanced SQL, Python, automation scripting, Kafka, Hive, Apache Spark, HBase, YARN, software development, cloud computing technologies, distributed systems, APIs, data querying, data analysis, machine learning training data preparation
cross-functional collaboration, communication, collaborating with ML and robotics teams, ensuring data accessibility, ensuring data quality and structure, problem-solving, customer-focused engineering mindset
Industry Aerospace
Job Function Design and deliver scalable, real-time data pipelines and storage systems for robotics and ML training.
Role Subtype Data Engineer
Tech Domains Python, SQL / PostgreSQL, Amazon Web Services, Kubernetes, Linux
Data Platform Engineer, data platform engineering, data pipelines, data ingestion, data processing, data storage, data transformation, real-time data processing, sensor telemetry, video streams, operational logs, performance metrics, advanced SQL, Python, Kafka, Hive, Apache Spark, HBase, YARN, cloud computing technologies, distributed systems, APIs, machine learning training

  • 3+ years of data engineering experience
  • Advanced SQL experience
  • Python scripting for automation
  • Experience with Kafka or the Hive/Spark/HBase/YARN stack
  • Experience with cloud computing technologies (required)
