Position Details

Salary: $150K – $200K USD / year

Type: Not specified

Experience: Senior

Exp. Years: Not specified

Education: Not specified

Category: AI & Machine Learning

About this role

Lucia Protocol is hiring a Senior Machine Learning Engineer to design, train, and deploy ML systems that power its attribution engine and behavioral intelligence platform. The role emphasizes ownership, early detection of design flaws, and shipping measurable improvements.

Key Responsibilities

Design, train, and tune high-accuracy gradient boosted tree (GBT) models (XGBoost, LightGBM)
Manage large datasets and model weights with DVC (Data Version Control)
Standardize the ML lifecycle with MLflow, including the Model Registry
Use Weights & Biases (W&B) for performance visualization and hyperparameter sweeps
Build a stacking architecture with scikit-learn (StackingClassifier/StackingRegressor) that combines KNN with gradient boosted trees

Technical Overview

This role requires deep CUDA proficiency and GPU-aware scaling across local and enterprise tiers. You will use tooling for data and model lifecycle management (DVC, MLflow, Weights & Biases) and build stacking-based ensembling using scikit-learn with KNN combined with Gradient Boosted Trees (XGBoost/LightGBM).
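
For candidates unfamiliar with the stacking setup named above, the following is a minimal sketch of a KNN + gradient boosted tree stack in scikit-learn. It is an illustration under stated assumptions, not Lucia Protocol's actual pipeline: the data is synthetic, scikit-learn's own GradientBoostingClassifier stands in for XGBoost/LightGBM so the example has no extra dependencies, and the logistic-regression meta-learner and all hyperparameters are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for real features (assumption, illustration only).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Base learners: KNN is distance-based, so it is scaled first;
# the tree ensemble does not need scaling.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))),
    # Stand-in for XGBoost/LightGBM (assumption to keep the sketch dependency-free).
    ("gbt", GradientBoostingClassifier(n_estimators=100, random_state=0)),
]

# The meta-learner is trained on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Because XGBoost and LightGBM both ship scikit-learn-compatible estimators, either could replace the "gbt" entry without changing the rest of the sketch; StackingRegressor follows the same pattern for regression targets.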

Ideal Candidate

The ideal candidate is a Senior Machine Learning Engineer who has deep, hands-on CUDA experience including GPU memory hierarchy and kernel optimization. They have built production ML pipelines using DVC, MLflow (including Model Registry), and Weights & Biases, and can implement advanced ensembling with scikit-learn stacking (KNN + XGBoost/LightGBM).

Must-Have Skills

CUDA out of memory error debugging
Global vs. Shared memory understanding
kernel optimization
CuPy
RAPIDS (cuDF/cuML)
NVRTC
DVC (Data Version Control)
MLflow Model Registry
Weights & Biases (W&B) hyperparameter sweep orchestration
scikit-learn StackingClassifier or StackingRegressor
XGBoost or LightGBM
K-Nearest Neighbors (KNN) + Gradient Boosted Tree (GBT) stacking architecture

Tools & Platforms

CUDA, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), MLflow, Weights & Biases (W&B), scikit-learn, XGBoost, LightGBM, RTX 5090, NVIDIA H200, AMD Instinct MI300X

Required Skills

CUDA, GPU memory hierarchy (Global vs. Shared memory), kernel optimization, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), MLflow, Model Registry, Weights & Biases (W&B), hyperparameter sweep orchestration, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors (KNN), KNeighborsTransformer, Gradient Boosted Tree (GBT), XGBoost, LightGBM, multi-stage pipelines, RTX 5090, NVIDIA H200, AMD Instinct MI300X

Hard Skills

CUDA, GPU memory hierarchy (Global vs. Shared memory), kernel optimization, debugging CUDA out of memory errors, GPU compute tier optimization, NVIDIA H200, AMD Instinct MI300X, CuPy, RAPIDS (cuDF/cuML), NVRTC, DVC (Data Version Control), Git-like workflows, MLflow, Model Registry, Weights & Biases (W&B), hyperparameter sweep orchestration, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors (KNN), KNeighborsTransformer, Gradient Boosted Tree (GBT), XGBoost, LightGBM, multi-stage pipelines

Soft Skills

leadership
initiative
accountability
full ownership of outcomes
early detection of design flaws
raises red flags immediately
drives resolution with urgency
relentless in ensuring the product performs as required
proactively moves the roadmap forward
scientific rigor
startup scrappiness
turning insight into shipped, measurable impact

Industry & Role

Industry: SaaS

Job Function: Own and build production machine learning systems, from GPU-optimized training to model lifecycle and advanced ensembling.

Role Subtype: MLOps Engineer

Tech Domains: Amazon Web Services, Python, Linux, Kubernetes, Docker

Keywords for Your Resume

Sr ML Engineer, Senior Machine Learning Engineer, Machine Learning Engineer, CUDA, GPU memory hierarchy, Global vs. Shared memory, kernel optimization, CuPy, RAPIDS, cuDF, cuML, NVRTC, DVC, Data Version Control, MLflow, Model Registry, Weights & Biases, W&B, hyperparameter sweep, scikit-learn, StackingClassifier, StackingRegressor, K-Nearest Neighbors, KNN, Gradient Boosted Tree, XGBoost, LightGBM, RAPIDS (cuDF/cuML), DVC (Data Version Control), MLflow Model Registry, Weights & Biases (W&B)

Deal Breakers

Strong hands-on CUDA experience, including debugging CUDA out of memory errors
Experience with model versioning and lifecycle tools (DVC and MLflow Model Registry)
