Position Details
About this role
This role involves designing and maintaining enterprise-scale data pipelines in a cloud environment, primarily using Python, PySpark, and Databricks, to support data sourcing, processing, and machine learning initiatives.
Key Responsibilities
- Design and maintain data pipelines
- Develop data models
- Optimize data workflows for scalability and performance
- Implement CI/CD pipelines
- Collaborate with data scientists
Technical Overview
The technical environment includes Python, PySpark, Databricks, SQL, PostgreSQL, and AWS cloud services such as S3, Lambda, and EKS, with a focus on scalable data processing and automation.
Ideal Candidate
The ideal candidate is a mid-level data engineer with 4+ years of experience in building scalable data pipelines using Python, PySpark, and Databricks, with strong SQL skills and familiarity with AWS cloud services.
Deal Breakers
- Less than 4 years of experience
- Lack of Python or SQL skills
- No experience with Databricks or PySpark
- No cloud platform experience