Position Details

Salary $172K – $260K USD / year

Type Not Specified

Experience Lead

Exp. Years 5+ years

Education Not Specified

Category AI & Machine Learning

About this role

This role involves leading the development of scalable ML infrastructure for LLM post-training, evaluation, and deployment, with a focus on feedback systems and distributed workflows.

Key Responsibilities

Build ML evaluation and deployment pipelines
Manage feedback and reward systems
Optimize distributed training workflows
Ensure reproducibility and operational excellence
Collaborate with research and engineering teams

Technical Overview

The environment includes Python, ML infrastructure, distributed systems, and pipelines for training, evaluation, and feedback management of large-scale models.

Ideal Candidate

The ideal candidate is a lead ML engineer with over 5 years of experience in ML infrastructure, specializing in scalable evaluation, deployment pipelines, and feedback systems for LLMs. They possess strong leadership skills and a deep understanding of distributed systems and operational best practices.

Must-Have Skills

5+ years in software engineering, ML systems, or distributed infrastructure
Experience with ML infrastructure for LLMs
Building scalable evaluation and deployment pipelines
Experience with feedback and reward systems

Nice-to-Have Skills

Experience with large-scale training workflows
Model evaluation
Feedback-driven learning
Distributed training optimization
System reliability

Tools & Platforms

Python, ML infrastructure, distributed systems, evaluation pipelines

Required Skills

ML infrastructure, evaluation pipelines, model deployment, distributed systems, training workflows, feedback pipelines, experiment management, reproducibility, versioning, monitoring

Hard Skills

ML infrastructure, LLM post-training, evaluation pipelines, model deployment, distributed systems, training workflows, feedback pipelines, experiment management, reproducibility, versioning, monitoring, operational excellence

Soft Skills

leadership, cross-functional collaboration, problem-solving, innovative thinking, ownership

Industry & Role

Industry SaaS

Job Function Leading scalable ML infrastructure for LLM post-training, evaluation, and deployment

Deal Breakers

Fewer than 5 years of experience in ML infrastructure
Lack of experience with scalable evaluation pipelines
No background in distributed systems or feedback systems

Lead Machine Learning Engineer, LLM Infrastructure
