Position Details

Salary $90K – $180K USD / year

Type Full-Time

Experience Senior

Exp. Years 5+ years

Education Bachelor's degree in Computer Science, Statistics, Mathematics, Engineering, or related field; Master's/PhD preferred

Category AI & Machine Learning

About this role

Design, build, and productionize end-to-end machine learning solutions and agentic AI systems using strong ML engineering practices. You will partner with product and engineering stakeholders to deliver measurable impact through scalable models, intelligent workflows, and evaluation frameworks.

Key Responsibilities

Build production ML systems with training/inference pipelines, serving patterns, CI/CD for ML, and observability
Develop agentic AI solutions using LLM-based agents and orchestration patterns
Build RAG/knowledge systems with indexing, chunking, embeddings, and reranking
Create evaluation and experimentation frameworks for ML and LLM/agent systems (golden sets, human-in-the-loop, A/B tests, guardrail metrics)
Collaborate with data engineering and platform teams on data quality, lineage, governance, and scalable infrastructure

Technical Overview

Build production ML systems with robust training/inference pipelines, model serving patterns, ML CI/CD, and observability for drift/performance/cost. Implement agentic AI using LLM-based agents (tool/function calling, orchestration, retrieval, memory) and RAG pipelines (indexing, chunking, embeddings, reranking), supported by offline/online evaluation and deployment using Airflow/Dagster/Prefect plus FastAPI/BentoML/TorchServe on Docker/Kubernetes.

Ideal Candidate

The ideal candidate is a Senior Data Scientist - MLE with 5+ years of production ML experience, strong Python skills, and proven ML engineering execution across pipelines, serving, CI/CD, and observability. They also have hands-on experience building agentic AI / LLM systems including tool/function calling and RAG workflows using orchestration frameworks such as LangGraph/LangChain, Semantic Kernel, and LlamaIndex.

Must-Have Skills

5+ years of experience building and deploying ML solutions in production environments
Strong proficiency in Python
Build production ML systems with robust training/inference pipelines
CI/CD and testing practices for ML
Monitoring/observability (model metrics, drift, logging)
Demonstrated ML engineering experience
Pipeline orchestration (Airflow, Dagster, Prefect)
Model packaging/serving (FastAPI, BentoML, TorchServe)
Containerization and deployment (Docker, Kubernetes)
Experience with agentic AI / LLM systems
Tool/function calling and agent orchestration frameworks (LangGraph/LangChain, Semantic Kernel, LlamaIndex)

Tools & Platforms

Airflow, Dagster, Prefect, FastAPI, BentoML, TorchServe, Docker, Kubernetes, LangGraph/LangChain, Semantic Kernel, LlamaIndex

Required Skills

Python, scikit-learn, XGBoost/LightGBM, PyTorch, TensorFlow, production ML systems, training/inference pipelines, model serving patterns, CI/CD for ML, observability (drift, performance, cost), agentic AI systems, VLM/LLM-powered agents, tool/function calling, orchestration, evaluation, retrieval-augmented generation (RAG), indexing, chunking, embeddings, reranking, offline/online evaluation frameworks, golden sets, human-in-the-loop review, A/B tests, guardrail metrics, Airflow, Dagster, Prefect, FastAPI, BentoML, TorchServe, Docker, Kubernetes, LangGraph/LangChain, Semantic Kernel, LlamaIndex

Hard Skills

Python, scikit-learn, XGBoost/LightGBM, PyTorch, TensorFlow, production ML systems, training/inference pipelines, model serving patterns, CI/CD for ML, observability (drift, performance, cost), agentic AI systems, VLM/LLM-powered agents, tool use, orchestration, evaluation, LLM-based agents, planning, tool/function calling, retrieval, memory, multi-agent patterns, retrieval-augmented generation (RAG), indexing, chunking, embeddings, reranking, latency, quality, cost, offline/online evaluation frameworks, golden sets, human-in-the-loop review, A/B tests, guardrail metrics, data quality, lineage, governance, scalable infrastructure, Airflow, Dagster, Prefect, FastAPI, BentoML, TorchServe, Docker, Kubernetes, monitoring/observability, drift, logging, statistics, experimental design, causal/measurement thinking, LangGraph/LangChain, Semantic Kernel, LlamaIndex, structured outputs, agent orchestration frameworks

Soft Skills

Partner with product, engineering, and business stakeholders
Stakeholder communication
Translate complex technical results into clear business outcomes and recommendations
Deliver measurable impact

Industry & Role

Industry Retail

Job Function Produce end-to-end, production ML and agentic AI capabilities with strong engineering, evaluation, and deployment practices.

Role Subtype ML Engineer

Tech Domains Python, Docker, Kubernetes, Machine Learning, AI & Machine Learning

Keywords for Your Resume

Senior, Data Scientist - MLE, Senior Data Scientist, machine learning solutions end-to-end, ML engineering best practices, agentic AI systems, VLM/LLM-powered agents, tool use, orchestration, evaluation, retrieval-augmented generation, RAG, indexing, chunking, embeddings, reranking, offline/online evaluation frameworks, golden sets, human-in-the-loop review, A/B tests, guardrail metrics, Python, scikit-learn, XGBoost, LightGBM, PyTorch, TensorFlow, Airflow, Dagster, Prefect, FastAPI, BentoML, TorchServe, Docker, Kubernetes, CI/CD, monitoring/observability, model metrics, LangGraph, LangChain, Semantic Kernel, LlamaIndex, Senior Data Scientist - MLE

Deal Breakers

Must have 5+ years of experience building and deploying ML solutions in production environments
Must have strong proficiency in Python
Must have demonstrated ML engineering experience including at least one of: Airflow/Dagster/Prefect and model serving with FastAPI/BentoML/TorchServe and deployment with Docker/Kubernetes

