Position Details
About this role
Design, implement, and maintain complex batch and real-time data pipelines and architectures. Ensure data quality, optimize data delivery, and support data scientists by integrating diverse datasets for machine learning and deep learning.
Key Responsibilities
- Develop and maintain data pipelines and data architectures
- Create and document physical data models, data dictionaries, and data flow diagrams
- Gather complex business requirements and define data requirements
- Build secure batch and real-time (streaming) data processing solutions in cloud and/or on-premises environments
- Mentor Data Engineers and partner with data scientists to integrate datasets for machine learning and deep learning
Technical Overview
The role focuses on developing large-scale batch and real-time data pipelines across cloud platforms and on-premises environments, with strong data modeling deliverables such as physical data models, data dictionaries, and data flow diagrams. It also involves DevOps and enterprise architecture standards, secure handling of structured and unstructured data, and collaboration with data scientists on datasets ready for machine learning and deep learning.
Ideal Candidate
The ideal candidate is an early-career data engineer who can design, implement, and maintain complex batch and real-time data pipelines. They have strong data modeling fundamentals (physical data models, data dictionaries, data flow diagrams) and can ensure data quality and securely handle structured and unstructured data. They also collaborate with data scientists to integrate datasets for machine learning and deep learning models, and contribute to DevOps and enterprise architecture practices.
Deal Breakers
Bachelor's Degree, Advanced English skills