Position Details
About this role
This role involves building and maintaining scalable data infrastructure, focusing on data pipelines, storage, and model evaluation systems. The engineer will improve data correctness, privacy, and cost-efficiency.
Key Responsibilities
- Redesign data pipelines
- Implement schema validation
- Optimize storage costs
- Improve telemetry and instrumentation
- Maintain data privacy guarantees
Technical Overview
The environment includes Spark, Databricks, Ray Data, and large-scale data pipelines. The focus is on data modeling, performance debugging, and instrumentation.
Ideal Candidate
The ideal candidate is a data infrastructure engineer with extensive experience in Spark, Ray Data, and large-scale data pipelines. They possess strong debugging skills, data modeling expertise, and a focus on correctness and cost-efficiency.
Deal Breakers
- Lack of experience with Spark or Ray Data
- No experience with data pipelines or storage systems
- Inability to troubleshoot performance issues
- No background in data modeling
- Lack of ownership in previous roles