Position Details

Type: Full-Time

Experience: Lead

Exp. Years: Not specified

Education: Not specified

Category: AI & Machine Learning

About this role

This role involves leading the inference routing and performance optimization team at Anthropic, focusing on system-level improvements to increase throughput and reduce latency across large AI inference fleets.

Key Responsibilities

Decide routing algorithm changes
Sequence system improvements
Debug latency issues
Build performance models
Coordinate fleet-wide efficiency

Technical Overview

The role focuses on building and optimizing distributed systems, load-balancing algorithms, cluster coordination, and performance tuning for AI inference infrastructure, and involves working closely with kernel and network internals.

Ideal Candidate

The ideal candidate is a senior engineering leader with deep expertise in distributed systems, load balancing, and performance optimization for AI inference fleets. They should have experience managing complex system architectures and improving throughput and latency in large-scale AI infrastructure.

Must-Have Skills

Distributed Systems, Load Balancing, Cluster Coordination, Performance Optimization, Latency Analysis

Nice-to-Have Skills

Kernel Debugging, Networking, ML Frameworks, System Architecture, High Performance Computing

Tools & Platforms

Kernels, ML Frameworks, Networking Tools, Cluster Management Software

Required Skills

Distributed Systems, Load Balancing, Cluster Coordination, Performance Optimization, Latency Analysis, Kernel Debugging, Networking, ML Frameworks, System Architecture, High Performance Computing

Hard Skills

Distributed Systems, Load Balancing, Cluster Coordination, Performance Optimization, Latency Analysis, Kernel Debugging, Networking, ML Frameworks, System Architecture, High Performance Computing

Soft Skills

Leadership, Problem-solving, Technical Decision Making, Collaboration, Analytical Thinking, Communication, Team Management

Industry & Role

Industry: Technology/AI

Job Function: Engineering leadership for AI inference system performance

Role Subtype: Engineering Manager

Tech Domains: Distributed Systems, Networking, Kernel Internals, High Performance Computing, ML Frameworks

Keywords for Your Resume

distributed systems, load balancing, cluster coordination, performance optimization, latency analysis, kernel debugging, networking, ML frameworks, system architecture, high performance computing, inference routing, AI systems, scalability, fleet efficiency, latency spikes

Deal Breakers

Lack of experience with distributed systems; no background in AI infrastructure; unfamiliar with load balancing or cluster coordination; no experience with performance tuning at scale


Engineering Manager, Inference Routing and Performance