Position Details
About this role
This role involves leading the inference routing and performance optimization team at Anthropic, focusing on system-level improvements to increase throughput and reduce latency across large AI inference fleets.
Key Responsibilities
- Decide on changes to routing algorithms
- Sequence and prioritize system improvements
- Debug latency issues across the inference fleet
- Build performance models to guide optimization
- Coordinate fleet-wide efficiency efforts
Technical Overview
The role focuses on building and optimizing distributed systems, load-balancing algorithms, cluster coordination, and performance tuning for AI inference infrastructure, working closely with kernel and network internals.
Ideal Candidate
The ideal candidate is a senior engineering leader with deep expertise in distributed systems, load balancing, and performance optimization for AI inference fleets. They should have experience managing complex system architectures and improving throughput and latency in large-scale AI infrastructure.
Deal Breakers
- Lack of experience with distributed systems
- No background in AI infrastructure
- Unfamiliarity with load balancing or cluster coordination
- No experience with performance tuning at scale