About this role
Lucia Protocol is hiring a Senior Machine Learning Engineer to design, train, and deploy ML systems that power its attribution engine and behavioral intelligence platform. The role emphasizes ownership, early detection of design flaws, and shipping measurable improvements.
Key Responsibilities
- Design, train, and tune high-accuracy GBT models (XGBoost, LightGBM)
- Manage large datasets and model weights via DVC (Data Version Control)
- Standardize the ML lifecycle with MLflow, including the Model Registry
- Use Weights & Biases (W&B) for performance visualization and hyperparameter sweeps
- Build a stacking architecture using scikit-learn (StackingClassifier/StackingRegressor) that combines KNN with gradient boosted trees
Technical Overview
This role requires deep CUDA proficiency and GPU-aware scaling across compute tiers, from local RTX 5090 setups to enterprise accelerators such as the NVIDIA H200 and AMD Instinct MI300X. You will manage the data and model lifecycle with DVC, MLflow, and Weights & Biases, and build stacked ensembles in scikit-learn that combine KNN with gradient boosted trees (XGBoost/LightGBM).
Ideal Candidate
The ideal candidate is a Senior Machine Learning Engineer who has deep, hands-on CUDA experience including GPU memory hierarchy and kernel optimization. They have built production ML pipelines using DVC, MLflow (including Model Registry), and Weights & Biases, and can implement advanced ensembling with scikit-learn stacking (KNN + XGBoost/LightGBM).
Must-Have Skills
- CUDA out of memory error debugging
- Global vs. Shared memory understanding
- kernel optimization
- CuPy
- RAPIDS (cuDF/cuML)
- NVRTC
- DVC (Data Version Control)
- MLflow Model Registry
- Weights & Biases (W&B) hyperparameter sweep orchestration
- scikit-learn StackingClassifier or StackingRegressor
- XGBoost or LightGBM
- K-Nearest Neighbors (KNN) + Gradient Boosted Tree (GBT) stacking architecture
Tools & Platforms
CUDA, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), MLflow, Weights & Biases (W&B), scikit-learn, XGBoost, LightGBM, RTX 5090, NVIDIA H200, AMD Instinct MI300X, RTX 5090 setups
Required Skills
CUDA, GPU memory hierarchy (Global vs. Shared memory), kernel optimization, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), MLflow, Model Registry, Weights & Biases (W&B), hyperparameter sweep orchestration, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors (KNN), KNeighborsTransformer, Gradient Boosted Tree (GBT), XGBoost, LightGBM, multi-stage pipelines, RTX 5090, NVIDIA H200, AMD Instinct MI300X
Hard Skills
CUDA, GPU memory hierarchy (Global vs. Shared memory), kernel optimization, debugging CUDA out of memory errors, GPU compute tier optimization, NVIDIA H200, AMD Instinct MI300X, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), Git-like workflows, MLflow, Model Registry, Weights & Biases (W&B), hyperparameter sweep orchestration, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors (KNN), KNeighborsTransformer, Gradient Boosted Tree (GBT), XGBoost, LightGBM, multi-stage pipelines
Soft Skills
- leadership
- initiative
- accountability
- full ownership of outcomes
- early detection of design flaws
- raises red flags immediately
- drives resolution with urgency
- relentlessly ensures the product performs as required
- proactively moves the roadmap forward
- scientific rigor
- startup scrappiness
- turning insight into shipped, measurable impact
Keywords for Your Resume
Sr ML Engineer, Senior Machine Learning Engineer, Machine Learning Engineer, CUDA, GPU memory hierarchy, Global vs. Shared memory, kernel optimization, CuPy, RAPIDS, cuDF, cuML, NVRTC, DVC, Data Version Control, MLflow, Model Registry, Weights & Biases, W&B, hyperparameter sweep, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors, KNN, Gradient Boosted Tree, XGBoost, LightGBM, RAPIDS (cuDF/cuML), DVC (Data Version Control), MLflow Model Registry, Weights & Biases (W&B)
Deal Breakers
- Strong hands-on CUDA experience, including debugging CUDA out of memory errors
- Experience with model versioning and lifecycle tools (DVC and MLflow Model Registry)