Position Details

Type: Not specified

Experience: Mid-level

Years of experience: Not specified

Education: Not specified

Category: Data & Analytics

About this role

This role involves building and maintaining scalable data infrastructure, focusing on data pipelines, storage, and model evaluation systems. The engineer will improve data correctness, privacy, and cost-efficiency.

Key Responsibilities

Redesign data pipelines
Implement schema validation
Optimize storage costs
Improve telemetry and instrumentation
Maintain data privacy guarantees

Technical Overview

The environment includes Spark, Databricks, Ray Data, and large-scale data pipelines. The work centers on data modeling, performance debugging, and instrumentation.

Ideal Candidate

The ideal candidate is a data infrastructure engineer with extensive experience in Spark, Ray Data, and large-scale data pipelines. They possess strong debugging skills, data modeling expertise, and a focus on correctness and cost-efficiency.

Must-Have Skills

Experience with Spark (Databricks or open-source Spark)
Production experience with Ray Data
Ownership of large data pipelines and storage systems
Debugging performance issues
Data modeling and maintainability

Nice-to-Have Skills

Schema evolution
Data validation
Data retention
Compression
Cost optimization

Tools & Platforms

Spark
Databricks
Ray Data

Required Skills

Spark
Databricks
Ray Data
Data pipelines
Storage systems
Performance debugging
Data modeling
Privacy Mode

Hard Skills

Spark
Databricks
Apache Spark
Ray Data
Data pipelines
Storage systems
Performance debugging
Data modeling
Privacy Mode

Soft Skills

Deep experience
Ownership
Design thinking
Judgment
Troubleshooting
Collaboration

Industry & Role

Industry: SaaS, Data Infrastructure

Job Function: Build and operate scalable data infrastructure for model signals and telemetry

Keywords for Your Resume

Spark
Databricks
Apache Spark
Ray Data
Data pipelines
Data storage
Performance debugging
Data modeling
Privacy Mode
Data validation
Schema evolution
Data retention
Compression
Cost optimization
Telemetry
Model evaluation

Deal Breakers

Lack of experience with Spark or Ray Data
No experience with data pipelines or storage systems
Inability to troubleshoot performance issues
No background in data modeling
Lack of ownership in previous roles

Software Engineer, Data Infrastructure
