About this role
Data Engineer II focusing on PySpark/Databricks in a federal civilian health program, building and optimizing data pipelines, data models, and analytics services in a Databricks/AWS environment.
Key Responsibilities
- Build and maintain PySpark data pipelines in the Databricks environment
- Optimize Spark job performance and resource usage
- Design, develop, and maintain backend software components and services
- Research and build proofs of concept in the data space
- Write clean, well-structured code and perform code reviews
Technical Overview
Responsibilities include PySpark/Databricks data pipelines, Spark performance optimization, backend components, and governance in Databricks and AWS. Emphasis on distributed computing, ETL/ELT, and data modeling.
Ideal Candidate
The ideal candidate is a senior data engineer with 8+ years of experience in Python and Spark/Databricks, strong PostgreSQL data modeling, ETL/ELT pipelines, and AWS data services (S3, Glue, Lambda). They should be adept at working in a remote, agile environment and capable of leading data engineering initiatives for public-sector programs.
Must-Have Skills
- Bachelor’s degree and 8 years of experience
- Strong experience with Python / Apache Spark
- Solid understanding of data modeling, ETL process, and distributed computing
- Bachelor's degree in Computer Science, Computer Engineering or related field
- Strong understanding of software design patterns, data structures, and algorithms
- Experience with Agile development methodologies
- Ability to work independently as well as in a team
- Strong problem-solving and analytical skills
- Related experience in analytic programming, data extraction, querying databases/data warehouses and data analysis
Nice-to-Have Skills
- AWS experience (S3, EC2, Glue, Lambda, etc.)
- R experience
- Professional Databricks/Apache Spark Certification(s)
- SAS experience
Tools & Platforms
Python, Apache Spark, PySpark, Databricks, Amazon Web Services, S3, Glue, Lambda, R, SAS
Hard Skills
Python, Apache Spark, PySpark, Databricks, SQL / PostgreSQL, ETL/ELT pipelines, Data modeling, GitLab CI/CD, AWS, S3, Glue, Lambda, R, SAS
Soft Skills
Strong problem-solving, analytical skills, communication, team collaboration, self-motivation
Certifications
Preferred: Professional Databricks/Apache Spark Certification
Keywords for Your Resume
Data Engineer II, PySpark, Databricks, Python, Apache Spark, SQL / PostgreSQL, ETL/ELT pipelines, AWS, Amazon Web Services, S3, Glue, Lambda, R, SAS, data modeling, distributed computing, Agile, documentation
Deal Breakers
- Fewer than 8 years of experience
- No Python / Spark / Databricks experience
- No SQL / PostgreSQL experience
- No AWS familiarity