
Principal Architect – HPC & AI (NVIDIA Ecosystem)

at World Wide Technology

📍 Remote, US · 💰 $215K – $245K USD / year · Posted April 10, 2026
Salary: $215K – $245K USD / year
Type: Full-Time
Experience: Lead
Exp. Years: 10+ years
Education: Bachelor's degree in a technical field or equivalent hands-on experience
Category: AI & Machine Learning

Principal Architect for HPC and AI within the NVIDIA ecosystem, responsible for end-to-end design and delivery of GPU-accelerated AI/HPC platforms across large-scale data centers and AI factories.

  • Lead end-to-end architecture of GPU-accelerated HPC and AI platforms
  • Architect integrated Compute/Networking/Storage using NVIDIA HGX and DGX
  • Design storage for AI training/inference and HPC
  • Provide hands-on leadership during implementation
  • Maintain high-quality architectural documentation

The role involves architecting GPU-accelerated compute across HGX/DGX, planning storage integrations (VAST, NetApp, WEKA, Lustre), orchestrating workloads with BCM/Slurm/Run:AI, working with Kubernetes and Linux, and addressing cooling and power considerations for data centers.

The ideal candidate is a senior lead or principal architect with 10+ years of HPC/AI architecture experience, expertise in the NVIDIA data center ecosystem (HGX/DGX), and a track record of designing large-scale GPU-accelerated AI/HPC platforms. They should be able to mentor engineers, author architectural documentation, and drive multi-site deployments.

  • Expert-level, deep architectural knowledge of NVIDIA data center platforms, including HGX and DGX
  • GPU-accelerated compute architecture for AI and HPC workloads
  • High-performance networking architectures, especially with Spectrum-X
  • Large-scale AI factory and HPC platform design
  • Hands-on architectural experience with high-performance parallel or scale-out storage systems
  • Hands-on experience with storage platforms such as VAST Data, NetApp, WEKA, DDN, Lustre
  • NVIDIA Base Command Manager (BCM) for cluster lifecycle management
  • Slurm for HPC workload scheduling
  • Run:AI for GPU orchestration and multi-tenant AI workload optimization
  • Kubernetes administration
  • Linux systems administration
  • Containerized AI workflows
  • Experience optimizing HPC/AI platforms for performance/utilization/cost
  • Senior individual contributor with technical authority
  • Mentoring engineers and architects
  • Multi-site, air-gapped, or regulated environments
  • Liquid cooling, power/cooling design, data center integration
NVIDIA Base Command Manager, Slurm, Run:AI, Kubernetes, Linux, VAST Data, NetApp, WEKA, DDN, Lustre
NVIDIA data center platforms, HGX, DGX, GPU-accelerated compute, Spectrum-X, high-performance storage, VAST Data, NetApp, WEKA, DDN, Lustre, NVIDIA Base Command Manager (BCM), Slurm, Run:AI, Kubernetes, Linux, containerized AI workflows, Grace CPU architectures, multi-site environments
Technical authority, mentoring, autonomy, leadership without people management, cross-functional collaboration
Industry: Consulting
Job Function: Lead architect for HPC and AI platforms built on the NVIDIA data center ecosystem
Role Subtype: Principal Architect
Tech Domains: Linux, Kubernetes, Slurm, NVIDIA Base Command Manager, Lustre, VAST Data, NetApp, WEKA, DDN, DGX, HGX

  • 10+ years in HPC, Data Center Architecture, and/or Systems Engineering
  • Experience acting as a senior technical authority
  • Strong ability to mentor engineers and architects
