Position Details
About this role
Seeking a data engineer with expertise in distributed data systems and cloud platforms to develop large-scale, cloud-based data pipelines supporting AI and ML initiatives.
Key Responsibilities
- Build scalable data pipelines
- Enable analytics and personalization
- Develop ML data workflows
- Collaborate with data scientists
- Design data ingestion and governance frameworks
Technical Overview
The environment includes Hadoop, Spark, Hive, Presto, Databricks, cloud storage (S3, Azure Blob Storage), and columnar data formats such as Parquet and ORC, with a focus on scalable data workflows.
Ideal Candidate
An experienced data engineer with more than 8 years of experience building large-scale, fault-tolerant data pipelines on cloud platforms such as AWS and Azure. Skilled in distributed data technologies such as Hadoop, Spark, and Hive, with a focus on data governance and ML workflows.
Deal Breakers
- Less than 8 years of experience
- Lack of cloud platform experience (Databricks, S3, EMR)
- No experience with distributed data technologies
- Unwillingness to work in a collaborative environment