Position Details

Type Not Specified

Experience Mid-level

Exp. Years 3+ years

Education Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field

Category Data & Analytics

About this role

This role involves designing, developing, and maintaining scalable data pipelines supporting AI research and production, with a focus on data governance, security, and automation.

Key Responsibilities

Design data pipelines
Collaborate with AI scientists
Apply data governance
Monitor data pipelines
Improve reliability

Technical Overview

The technical environment includes Python, SQL, NoSQL databases, cloud platforms (AWS, Azure, GCP), Kubernetes, CI/CD pipelines, data warehousing, and vector databases.

Ideal Candidate

The ideal candidate is a mid-level data engineer with 3+ years of experience in building scalable data pipelines, proficient in Python and SQL, with knowledge of cloud platforms like AWS, Azure, or GCP. They should have experience with data governance, automation, and working with NoSQL and vector databases.

Must-Have Skills

Python, SQL, Data pipelines, ETL/ELT processes, Data governance, Cloud platforms

Nice-to-Have Skills

Java, Scala, NoSQL databases, Vector embedding stores, Knowledge graphs, Automation testing, Infrastructure-as-code

Tools & Platforms

AWS, Azure, GCP, Kubernetes, CI/CD tools

Required Skills

Python, SQL, NoSQL, Data pipelines, ETL, ELT, Data governance, Cloud platforms, AWS, Azure, GCP, Kubernetes, CI/CD, Data warehousing, Lakehouse, Vector databases, Knowledge graphs, Automation, Schema validation, Data security, Data quality

Hard Skills

Python, Java, Scala, SQL, NoSQL, Data warehousing, Lakehouse platforms, ETL, ELT, Data pipelines, Data governance, Automation, Schema validation, Vector databases, Knowledge graphs, Cloud platforms, AWS, Azure, GCP, Orchestration frameworks, CI/CD

Soft Skills

Collaboration, Problem-solving, Troubleshooting, Continuous improvement, Documentation, Automation, Teamwork

Industry & Role

Industry Healthcare IT

Job Function Build and maintain data infrastructure for AI initiatives

Role Subtype Data Engineer

Tech Domains Python, SQL / PostgreSQL, NoSQL, AWS, Azure, GCP, Kubernetes, CI/CD

Keywords for Your Resume

Data Engineer, ETL, ELT, Data pipelines, Data governance, Python, SQL, NoSQL, Cloud platforms, AWS, Azure, GCP, Kubernetes, CI/CD, Data warehousing, Lakehouse, Vector databases, Knowledge graphs, Automation, Schema validation, Data security, Data quality, ETL pipelines

Deal Breakers

No experience with cloud platforms, no experience with data pipelines or ETL, no proficiency in Python or SQL, no data governance experience

AI Data Engineer
