About this role
This role involves developing scalable data pipelines and applications for processing aviation, geospatial, and weather data, supporting real-time and batch analytics for defense and aerospace clients.
Key Responsibilities
- Develop data pipelines
- Process aviation and geospatial data
- Implement scalable ETL/ELT workflows
- Support real-time streaming applications
- Collaborate on data platform development
Technical Overview
The technical environment includes Databricks, PySpark, Python, C++, Java, NoSQL databases like MongoDB and Cassandra, data warehousing solutions, and UNIX/Linux systems, with a focus on big data and real-time data processing.
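The scalable ETL/ELT pipeline work named above follows the usual extract, transform, load pattern. A minimal pure-Python sketch of that pattern is below; the record fields (station, temp_f, wind_kt) and sample data are hypothetical, invented for illustration, and in this environment the same structure would typically be expressed as Databricks/PySpark DataFrame transformations rather than plain lists:

```python
# Minimal ETL sketch. All field names and sample values are hypothetical.

def extract(raw_lines):
    """Parse comma-separated weather observations into dicts (extract step)."""
    records = []
    for line in raw_lines:
        station, temp_f, wind_kt = line.split(",")
        records.append({"station": station,
                        "temp_f": float(temp_f),
                        "wind_kt": float(wind_kt)})
    return records

def transform(records):
    """Convert units: Fahrenheit -> Celsius, knots -> m/s (transform step)."""
    return [{"station": r["station"],
             "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1),
             "wind_ms": round(r["wind_kt"] * 0.514444, 1)}
            for r in records]

def load(records, store):
    """Append transformed records to a destination store (load step)."""
    store.extend(records)
    return store

raw = ["KJFK,68.0,10.0", "KLAX,77.0,20.0"]
warehouse = []
load(transform(extract(raw)), warehouse)
```

At scale, each step would become a distributed operation (e.g., a Spark read, a DataFrame `select`/`withColumn`, and a write to a warehouse table), but the shape of the pipeline is the same.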
Ideal Candidate
The ideal candidate is a mid-level data engineer with at least 2 years of experience in application development using Databricks or PySpark and strong programming skills in C++, Java, or Python. They should have experience working with aviation, geospatial, and weather datasets, and be able to develop scalable data pipelines.
Must-Have Skills
- 2+ years of experience in application development using Databricks or PySpark
- 2+ years of experience utilizing programming languages, including C++, Java, or Python
- 2+ years of experience developing and maintaining scalable data stores
- Experience with aviation, geospatial, and weather datasets
- Experience creating software for retrieving, parsing, and processing structured and unstructured data
- Experience developing scalable ETL/ELT workflows
- Ability to develop scripts and programs for data conversion
- Ability to obtain and maintain a Public Trust or Suitability/Fitness determination
- Bachelor's degree
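The "retrieving, parsing, and processing structured and unstructured data" requirement above often comes down to pattern extraction from free text. A short hypothetical sketch, assuming an invented report format (the message text and field layout are not from the posting):

```python
import re

# Hypothetical unstructured position report; format invented for illustration.
REPORT = "Flight UA123 observed at lat 40.64, lon -73.78, altitude 32000 ft"

def parse_position(text):
    """Extract (lat, lon) from free-text reports via a regular expression."""
    match = re.search(r"lat\s+(-?\d+\.\d+),\s*lon\s+(-?\d+\.\d+)", text)
    if match is None:
        return None  # report did not contain a recognizable position
    return float(match.group(1)), float(match.group(2))

position = parse_position(REPORT)
```

Structured inputs (CSV, JSON) would instead be parsed with schema-aware readers; the regex approach is what typically handles the unstructured remainder.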
Nice-to-Have Skills
- Experience in application development utilizing SQL or Scala
- Experience with a public cloud, including AWS, Azure, or Google Cloud
- Experience working on real-time data and streaming applications
- Experience with NoSQL implementation using MongoDB or Cassandra
- Experience with data warehousing, including AWS Redshift, MySQL, or Snowflake
- Experience with UNIX/Linux, including shell scripting
- Experience with Agile practices
Tools & Platforms
Databricks, PySpark, MongoDB, Cassandra, AWS Redshift, MySQL, Snowflake, UNIX/Linux
Required Skills
Databricks, PySpark, Python, C++, Java, data stores, big data, ETL, data pipelines, aviation datasets, geospatial datasets, weather datasets, data retrieval, data parsing, structured data, unstructured data, NoSQL, MongoDB, Cassandra, data warehousing, AWS Redshift, MySQL, Snowflake, UNIX/Linux, shell scripting, real-time data, streaming applications, Agile engineering
Soft Skills
Collaboration, teamwork, problem-solving, communication, adaptability
Keywords for Your Resume
Data Engineer, Databricks, PySpark, Python, C++, Java, data stores, big data, ETL, data pipelines, aviation datasets, geospatial datasets, weather datasets, data retrieval, data parsing, structured data, unstructured data, NoSQL, MongoDB, Cassandra, data warehousing, AWS Redshift, MySQL, Snowflake, UNIX/Linux, shell scripting, real-time data, streaming applications, Agile engineering
Deal Breakers
- Less than 2 years of experience with Databricks or PySpark
- No experience with aviation or geospatial datasets
- No Bachelor's degree
- Inability to obtain or maintain a Public Trust or Suitability/Fitness determination