Director, Software Engineering

at Walmart

Location (USA) ISD Office - DGTC AR BENTONVILLE Home Office
Posted April 14, 2026
Salary $130K – $260K USD / year
Type Full-Time
Experience Executive
Exp. Years Not specified
Education Bachelor's degree
Category AI & Machine Learning

Walmart is hiring a Director of Software Engineering to lead the strategy, architecture, and execution of the Agent AI Platform powering Sparky and next-gen customer experiences. The role focuses on scaling agent orchestration, RAG, evaluation, safety/guardrails, observability, and real-time inference across products.

  • Lead architecture and development of Sparky Agent AI Platform
  • Design and scale distributed systems for LLMs and retrieval-augmented generation (RAG)
  • Define platform capabilities (context management, evaluation pipelines, safety/guardrails, observability)
  • Deliver production-ready AI platform features with operational maturity (SLOs, incident management)
  • Lead and mentor backend and platform engineering teams and drive platform roadmaps

You will design and scale distributed systems for LLMs, retrieval-augmented generation (RAG), knowledge bases, and agent orchestration, with strong emphasis on real-time AI inference. You will also establish evaluation pipelines, safety/guardrails, observability, SLOs, incident management, and CI/CD for cloud-native deployments.

The ideal candidate is an executive-level engineering leader with deep expertise in agent-based AI platforms, including LLMs, retrieval-augmented generation (RAG), and real-time AI inference. They have led platform architecture and delivery at scale, built evaluation and safety/guardrails frameworks, and managed backend/platform teams with strong operational practices (SLOs, incident management, performance monitoring).

  • Lead architecture and development of Sparky Agent AI Platform
  • Design and scale distributed systems supporting LLMs, retrieval-augmented generation (RAG), agent orchestration, and real-time AI inference
  • Define platform capabilities including context management, evaluation pipelines, safety/guardrails, observability, and cost-efficient inference
  • Establish engineering best practices, high code quality, and strong operational maturity (SLOs, incident management, performance monitoring)
  • Improve developer workflows, CI/CD pipelines, and cloud-native deployment processes
  • Lead, grow, and mentor a team of backend and platform engineers
CI/CD pipelines, cloud-native deployment processes
Sparky Agent AI Platform, agent orchestration, distributed systems, LLMs, retrieval-augmented generation (RAG), real-time AI inference, evaluation pipelines, safety/guardrails, observability, SLOs, incident management, performance monitoring, CI/CD pipelines, cloud-native deployment, platform roadmaps
AI technology leadership, architecture, development, distributed systems, LLMs, retrieval-augmented generation (RAG), retrieval systems, knowledge bases, agent orchestration, real-time AI inference, context management, evaluation pipelines, safety/guardrails, observability, cost-efficient inference, performance, reliability, accessibility, security, SLOs, incident management, performance monitoring, CI/CD pipelines, cloud-native deployment processes, model integration, experimentation, continuous improvement, backend engineering, platform engineering, engineering best practices, high code quality, operational maturity, technical roadmaps, people leadership, mentoring, team planning, goal setting, career development, performance management, collaboration, evaluation frameworks, enterprise-wide AI strategy
leadership, mentoring, grow and mentor a team, cross-functional collaboration, technical excellence culture, high performance culture, innovation culture, accountability, learning, communication, partnership with Product and UX, partnering with Data Science and Research teams
Industry Retail
Job Function Lead and scale Walmart's agent AI platform engineering
Role Subtype VP of Engineering
Tech Domains Azure, Amazon Web Services, AI & Machine Learning, Kubernetes, DevOps & SRE
Director, Software Engineering, Director of Engineering, Engineering leadership, Sparky Agent AI Platform, agent orchestration, LLMs, retrieval-augmented generation (RAG), knowledge bases, real-time AI inference, distributed systems, context management, evaluation pipelines, safety/guardrails, observability, cost-efficient inference, performance monitoring, incident management, SLOs, CI/CD pipelines, cloud-native deployment, platform roadmaps, backend engineers, platform engineers, mentoring, technical strategy

  • Must have demonstrated experience with LLMs and retrieval-augmented generation (RAG)
  • Must have experience leading architecture and development of agent orchestration platforms
  • Must have led teams and execution delivery at a director/executive level
  • Must have experience with operational maturity concepts like SLOs and incident management
