✦ Luna Orbit — Data & Analytics

Data Engineer, Amazon PeopleInsights eXperience (APIX)

at Amazon.com

📍 US, WA, Seattle
Posted April 14, 2026
Type Not Specified
Experience mid
Exp. Years Not specified
Education Not specified
Category Data & Analytics

Build and maintain foundational data infrastructure and analytics tools for Amazon PeopleInsights eXperience (APIX). Design scalable data pipelines and data lake capabilities that transform complex HR Ops and employee experience data into actionable, self-serve insights.

  • Design and implement high-performance, cost-efficient data lake infrastructure using the AWS big data stack: Spark, Hive, SQL, Apache Airflow, AWS Glue, EMR, S3, Redshift, and OLAP technologies
  • Collaborate with Business Intelligence Engineers to build semantic layers and optimize SQL queries
  • Follow software best practices including coding standards, code reviews, and testing
  • Work directly with customers to integrate new data types, curate data profiles, perform data quality checks, and incorporate feedback
  • Enable technical and non-technical customers to drive self-serve analytics and ad-hoc reporting; iterate via proofs of concept

Responsible for high-performance, cost-efficient data lake infrastructure on AWS using Spark, Hive, SQL, Apache Airflow, AWS Glue, Amazon EMR, Amazon S3, and Amazon Redshift with OLAP technologies. Collaborates on semantic layers, query optimization, testing/code reviews, and data quality workflows (profiling and validation), while supporting self-serve reporting.

The ideal candidate is a mid-level Data Engineer experienced in building scalable data lake infrastructure and data pipelines in a big data environment. They have hands-on experience with the AWS big data stack (Spark, Hive, SQL, Apache Airflow, AWS Glue, EMR, S3, Redshift) and can optimize SQL for fast, cost-efficient analytics while enabling self-serve reporting and data quality workflows.

  • Experience working with big data
  • Building data lakes and data processing services
  • Experience with one or more query languages (SQL, PL/SQL, HiveQL, SparkSQL)
  • Experience with one or more scripting languages (Python, Scala)
  • Design, build, and maintain scalable data pipelines and infrastructure
  • Ability to optimize SQL queries for fast, cost-efficient access
Amazon Web Services (AWS), AWS big data stack, Amazon EMR (Elastic MapReduce), Amazon S3 (Simple Storage Service), Amazon Redshift, Apache Airflow, AWS Glue, Hive, Spark, SQL, PL/SQL, HiveQL, SparkSQL, Python, Scala
Data lake infrastructure, AWS big data stack, Spark, Hive, SQL, Apache Airflow, AWS Glue, EMR, S3, Redshift, OLAP, semantic layers, optimize SQL queries, coding standards, code reviews, testing, data quality checks, data profiling, self-serve reporting, proofs of concept, Python, Scala, PL/SQL, HiveQL, SparkSQL
data lake infrastructure, AWS big data stack, Spark, Hive, SQL, Apache Airflow, AWS Glue, Amazon EMR, S3, Redshift, OLAP technologies, semantic layers in reporting and analysis, optimize SQL queries, coding standards, code reviews, testing, data processing systems, query language, PL/SQL, HiveQL, SparkSQL, scripting language, Python, Scala, data quality checks, data profiling, building data pipelines, analytics tools, self-serve reporting, proofs of concept, integrating new data types
collaborative problem-solving; ability to work in a degree of ambiguity; willingness to develop quick proofs of concept; iterate and improve; inclusion culture; cross-functional collaboration with product managers, business intelligence engineers, and HR business partners; work with technical and non-technical internal customers; customer integration and feedback incorporation
Industry SaaS
Job Function Engineer scalable AWS data infrastructure and pipelines to deliver people analytics capabilities.
Role Subtype Data Engineer
Tech Domains Amazon Web Services, Python, SQL / PostgreSQL, Data & Analytics, Linux, Apache Airflow
Data Engineer, Amazon PeopleInsights eXperience (APIX), APIX, Data team, data infrastructure, data lake, data pipelines, scalable data pipelines, AWS big data stack, Spark, Hive, SQL, Apache Airflow, AWS Glue, EMR, Amazon EMR (Elastic MapReduce), S3, Redshift, OLAP, semantic layers, reporting and analysis, optimize SQL queries, coding standards, code reviews, testing, data quality checks, data profiling, self-serve reporting, proofs of concept, Python, Scala, PL/SQL, HiveQL, SparkSQL, Amazon S3, Amazon Redshift

  • Must have experience building data lakes and data processing services
  • Must have experience with one or more query languages (SQL, PL/SQL, HiveQL, SparkSQL)
  • Must have experience with one or more scripting languages (Python, Scala)
