✦ Luna Orbit — Consulting & Advisory

Engineering Manager, Agent Prompts & Evals

at Anthropic

📍 San Francisco, CA | New York City, NY
Posted March 26, 2026
Type: Not specified
Experience: Lead
Years of experience: Not specified
Education: Not specified
Category: Consulting & Advisory

Lead the development of infrastructure and evaluation systems for AI models, overseeing prompt management, regression detection, and deployment pipelines to ensure reliable model performance.

  • Lead platform development
  • Manage model evaluation systems
  • Oversee prompt infrastructure
  • Detect regressions
  • Coordinate model launches

The role focuses on building and managing AI evaluation platforms, model deployment workflows, and infrastructure, and calls for expertise in CI/CD, version control, and model safety metrics.

The ideal candidate is an experienced engineering manager with a background in platform or devtools teams, strong leadership skills, and expertise in model evaluation, deployment, and infrastructure management in AI systems.

  • Experience leading platform or devtools teams
  • Strong understanding of model evaluation
  • Experience with prompt engineering
  • Infrastructure management
  • CI/CD pipelines
  • Version control systems
  • Model deployment experience

  • Experience with AI systems
  • Knowledge of model behavior metrics
  • Experience with regression detection systems
  • Familiarity with model safety and steerability

CI/CD, version control, model deployment, model evaluation tools
team leadership, platform development, model evaluation, prompt engineering, infrastructure, CI/CD, version control, model behavior, regression detection, model deployment
leadership, collaboration, communication, problem-solving, adaptability
Industry: Technology
Job Function: Managing AI model evaluation and deployment infrastructure
Role Subtype: Engineering Manager
Tech Domains: Model evaluation, CI/CD, version control, model deployment, infrastructure
engineering manager, prompt engineer, model evaluation, infrastructure, CI/CD, version control, model deployment, regression detection, platform team, AI systems, model behavior, model safety, leadership, team management, devtools, prompt engineering

  • No leadership experience
  • Lack of experience with model evaluation or deployment
  • No background in infrastructure or CI/CD
  • Familiarity only with non-AI systems

