✦ Luna Orbit — System Administration

Lead, AI Production Services

at AECOM

📍 Dallas, TX Hybrid Posted March 30, 2026
Type Not Specified
Experience lead
Exp. Years 6+ years
Education Bachelor's Degree plus extensive years of experience in enterprise IT operations
Category System Administration

Own and build the enterprise AI Operations practice, ensuring production AI, agentic, and automation solutions are reliable, observable, well-governed, and continually improving.

  • Own the AI Operations strategy and governance
  • Drive production reliability, support, and governance
  • Lead observability and optimization
  • Establish reporting and operational reviews
  • Partner with product, delivery, and technical teams

The role matures AI Operations through ITIL-aligned incident and change management, observability (Prometheus, Grafana, OpenTelemetry), and FinOps practices, and leads across AI platforms, governance, and cost optimization.

The ideal candidate is a senior AI operations leader with 6+ years overseeing AI/ML production systems in large-scale environments, strong observability and cost optimization expertise, and experience maturing AI operations as a capability.

  • Bachelor's degree plus extensive years of experience in enterprise IT operations, service management, reliability engineering, or production support
  • 6+ years of leadership experience in AI/ML/agentic/production systems
  • Proven experience defining and governing operations frameworks
  • Deep knowledge of AI/agentic production challenges, ITIL practices, observability (Prometheus, Grafana, OpenTelemetry), and incident/change management
  • Hands-on experience with AI platforms like Azure AI Foundry, AWS Bedrock, LangChain, or UiPath in production
  • Knowledge of Responsible AI operations

Prometheus, Grafana, OpenTelemetry, Azure AI Foundry, Amazon Web Services Bedrock, LangChain, UiPath

ITIL, observability, incident management, change management, production readiness, AI operations, cost optimization, governance, dashboards, vendor governance

AI Operations, Observability, ITIL, Incident management, Problem management, Change management, Service management, Observability tools (Prometheus, Grafana, OpenTelemetry), FinOps / cost optimization, SRE / DevOps, QA and production readiness, Metrics and dashboards, Vendor governance, Executive reporting

Leadership, Strategic thinking, Communication, Stakeholder management, Mentoring
Industry Construction & Engineering
Job Function Own the Enterprise AI Operations Practice End-to-End
Role Subtype Operations Manager
lead ai production services, ai operations, ai ops, observability, prometheus, grafana, open telemetry, finops, cloud cost optimization, sre, devops, production reliability, incident management, change management, sla, vendor governance, stakeholder management, executive reporting, hybrid, remote, AI operations, ai production systems

Lack of AI/ML production ops experience, inability to work in a hybrid work arrangement, no experience with production observability tooling
