✦ Luna Orbit — Project & Program Management

Sr. Technical Program Manager, Global Data Center Operations (Central Operations)

at Amazon.com

📍 US, WA, Seattle Unknown Posted April 14, 2026
Type Full-Time
Experience senior
Exp. Years Not specified
Education Not specified
Category Project & Program Management

Amazon Web Services is seeking a Senior Technical Program Manager to improve the health, stability, and operational excellence of new hardware deployments across its data center fleet. This role combines technical program management with strategic operational advisory for AI/ML and high-performance computing infrastructure.

  • Own operational health metrics for new AI/ML hardware platforms
  • Lead root-cause analyses, experiments, and post-mortem processes
  • Manage complex multi-phase infrastructure projects with cross-functional teams
  • Create KPIs and hardware health scorecards for deployment readiness and risk mitigation
  • Serve as operational point of contact, summarizing platform status and path-to-green

The technical scope centers on operational health metrics (failure rate, repair efficacy, repair dwell time) and systematic improvements to break/fix processes. The TPM will lead cross-functional investigations and post-mortems, manage multi-phase infrastructure project schedules/budgets, and translate operational data for teams like EC2.

The ideal candidate is a senior technical program manager with hands-on experience driving hardware deployment operational excellence in a data center environment. They have strong KPI ownership around failure rates and repair performance, lead root-cause and post-mortem processes, and can coordinate cross-functional teams spanning hardware engineering, supply chain, and data center operations.

drive the healthstabilityand operational excellence of new hardware deploymentsown operational health (failure raterepair efficacyrepair dwell timebreak/fix process improvement)lead cross-functional investigationsexperimentsand post-mortem processesmanage complexmulti-phase infrastructure projects across hardware engineeringsupply chaindata center operationsand software teamssummarize platform operational status and path-to-green
technical program managementoperational healthhealth and stability metricsfailure raterepair efficacyrepair dwell timebreak/fix process improvementKPI developmenthardware health scorecardsroot cause analysispost-mortem processescross-functional investigationscapacity planningrisk mitigationdata center operationshardware deploymentshigh-performance computingAI/ML hardware platformsEC2
technical program managementmulti-phase infrastructure project managementhealth and stability metricsfailure rate analysisrepair efficacy analysisrepair dwell time analysisbreak/fix process improvementroot cause analysispost-mortem processesKPI developmenthardware health scorecardscapacity planningrisk mitigationoperational health ownershipplatform operational status reportingdeployment readiness assessmentdata center operationshardware deployment optimizationcross-functional investigationsprocess improvementoperational excellencetranslating complex technical data into actionable insights
cross-functional collaborationcommunicationtechnical advocacystrategic advisorystakeholder managementleadershiprelationship buildingproblem-solving
Industry SaaS
Job Function Drive operational excellence for new AI/ML hardware deployments through technical program management and KPI-driven health ownership.
Role Subtype Technical Project Manager
Tech Domains Amazon Web Services
Sr. Technical Program ManagerSenior Technical Program ManagerTechnical Program Managementtechnical advocatestrategic advisoroperational healthfailure raterepair efficacyrepair dwell timebreak/fix process improvementKPIshardware health scorecardsdeployment readinesscapacity planningrisk mitigationroot cause analysispost-mortem processescross-functional investigationsmulti-phase infrastructure projectsdata center operationshardware engineeringsupply chainEC2electrical? Not specifiedGenAIhigh-performance computinghigh-performance computing infrastructureAI/ML hardware platformsplatform operational statuspath-to-greentechnical program management

Must demonstrate ownership of operational health metrics including failure rate, repair efficacy, and repair dwell time, Must show experience managing multi-phase infrastructure projects with cross-functional stakeholders

Apply for this Position →

Get matched to jobs like this

Luna finds roles that fit your skills and career goals — no endless scrolling required.

Create a Free Profile