About this role
As a Lead Data Engineer at Capital One, you design, develop, test, implement, and support cloud-based data solutions in collaboration with Agile teams. You work across distributed systems and big data technologies to deliver performance-tuned pipelines and real-world customer impact.
Key Responsibilities
- Design and build cloud-based data solutions with Agile teams
- Develop data engineering capabilities using Python, Spark, and cloud platforms
- Work with distributed microservices and full-stack systems
- Implement and test pipelines with unit tests and performance tuning
- Utilize big data/distributed tools and support real-time and streaming applications
Technical Overview
The role centers on Python and Spark-based data engineering, cloud computing on AWS (Amazon Web Services), and data warehousing with Redshift and/or Snowflake. It also emphasizes distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka), real-time/streaming applications, and engineering rigor via unit tests, code reviews, and UNIX/Linux shell scripting.
Ideal Candidate
The ideal candidate is a lead-level data engineer with at least 4 years of application development experience and 2+ years in big data technologies, plus 1+ year of AWS (Amazon Web Services) or other cloud computing experience. They are strong in Python and Spark, have data warehousing experience with Redshift and/or Snowflake, and are comfortable with distributed computing tools, streaming, and UNIX/Linux with shell scripting.
Must-Have Skills
- Bachelor's Degree
- At least 4 years of experience in application development
- At least 2 years of experience in big data technologies
- At least 1 year of experience with cloud computing (AWS, Microsoft Azure, Google Cloud)
Nice-to-Have Skills
- 7+ years of experience in application development including Python, SQL, Scala, or Java
- 4+ years of experience in a public cloud (AWS, Microsoft Azure, Google Cloud)
- 4+ years of experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)
- 4+ years of experience working on real-time data and streaming applications
- 4+ years of experience with NoSQL implementation (Mongo, Cassandra)
- 4+ years of data warehousing experience (Redshift or Snowflake)
- 4+ years of experience with UNIX/Linux including basic commands and shell scripting
- 2+ years of experience with Agile engineering practices
Tools & Platforms
Amazon Web Services (AWS), AWS, Microsoft Azure, Google Cloud, Amazon Redshift, Redshift, Snowflake, MapReduce, Hadoop, Hive, EMR, Amazon EMR, Kafka, Spark, Gurobi, MySQL, Mongo, MongoDB, Cassandra, UNIX/Linux, shell scripting, Agile
Required Skills
Python, Spark, Glue, AWS, Amazon Web Services, big data technologies, cloud computing, Java, Scala, NoSQL databases, Amazon Redshift, Snowflake, unit tests, Agile, distributed microservices, MapReduce, Hadoop, Hive, EMR, Kafka, real-time data, streaming applications, Mongo, Cassandra, UNIX/Linux, shell scripting
Hard Skills
full-stack development, full-stack development tools and technologies, machine learning, distributed microservices, full stack systems, Java, Scala, Python, Open Source RDBMS, NoSQL databases, Amazon Redshift, Redshift, Snowflake, unit tests, application development, big data technologies, cloud computing, AWS, Amazon Web Services, Microsoft Azure, Google Cloud, MapReduce, Hadoop, Hive, EMR, Amazon EMR, Kafka, Spark, real-time data, streaming applications, NoSQL implementation, Mongo, MongoDB, Cassandra, UNIX/Linux, shell scripting, Agile engineering practices, Agile teams
Soft Skills
collaboration, mentoring, code review, iterative delivery, learning new technologies, participating in technology communities, communication, customer focus
Keywords for Your Resume
Lead Data Engineer, Lead Data Engineer (Python, Spark, Glue, AWS), Data Engineer, Python, AWS, Amazon Web Services, big data technologies, cloud computing, Amazon Redshift, Redshift, Snowflake, NoSQL databases, unit tests, Agile, Agile teams, distributed microservices, Java, Scala, Open Source RDBMS, UNIX/Linux, shell scripting, MapReduce, Hadoop, Hive, EMR, Amazon EMR, Kafka, real-time data, streaming applications, NoSQL implementation, Mongo, MongoDB, Cassandra, data warehousing, NoSQL
Deal Breakers
- Bachelor's Degree required
- At least 4 years of experience in application development
- At least 2 years of experience in big data technologies
- At least 1 year of experience with cloud computing (AWS, Microsoft Azure, Google Cloud)
- No visa sponsorship for this position