
Technical Program Manager, Reliability Engineering

at Anthropic

📍 San Francisco, CA | Seattle, WA (Hybrid)
Posted March 07, 2026
Type Not specified
Experience mid
Exp. Years 7+ years
Education Not specified
Category IT Support & Helpdesk

This role involves owning and evolving the incident management program, leading incident response efforts, and driving reliability improvements across engineering teams.

  • Own incident management program
  • Lead incident response
  • Improve tooling and processes
  • Coordinate incident reviews
  • Drive reliability initiatives

The role focuses on incident response, reliability engineering, operational process development, tooling, and high-severity incident management in a fast-paced environment.

The ideal candidate is a senior incident manager or SRE with over 7 years of experience leading incident response programs in high-growth or infrastructure-intensive environments. They are highly organized, skilled in tooling and process development, and capable of managing high-severity incidents effectively.

  • 7+ years in incident management or SRE
  • Experience leading incident response programs
  • High-growth or infrastructure-heavy environment experience
  • On-call responsibilities
  • Building operational processes

  • Experience with incident tooling
  • Scaling operational processes
  • Reliability program initiatives

Incident management tools, Monitoring systems, Communication platforms

Incident management, Incident response, Reliability engineering, Operational excellence, Root cause analysis, On-call management, Process development, Tooling, Remediation, Communication

Organization, Communication, Leadership, Problem-solving, Collaboration, Stress management

Industry Technology / SaaS / Infrastructure / Reliability Engineering
Job Function Manage and improve incident response and reliability processes to ensure operational stability.
Incident management, Incident response, Reliability engineering, Operational excellence, Root cause analysis, On-call management, Process development, Tooling, Remediation, Communication, High-severity incident response, Scaling operational processes, Reliability outcomes, Incident review

  • Less than 7 years of relevant experience
  • No experience with incident response programs
  • Lack of infrastructure or high-growth environment background
  • Inability to lead on-call or incident response efforts
