Principal Engineer - GPU and LLM Infrastructure

at Wells Fargo

📍 3 Locations · Hybrid · 💰 $159K – $305K USD / year · Posted March 13, 2026
Salary: $159K – $305K USD / year
Type: Full-Time
Experience: Lead
Exp. Years: 7+ years
Education: Not specified
Category: AI & Machine Learning

This role involves leading the development and optimization of enterprise GPU platforms for AI workloads, focusing on NVIDIA GPU ecosystems, large language models, and multi-cloud deployment strategies.

  • Design GPU architectures
  • Lead AI platform strategy
  • Optimize GPU workloads
  • Implement high-performance inference pipelines
  • Manage multi-cloud GPU deployments

The position requires deep expertise in NVIDIA GPU hardware, CUDA, LLM/SLM runtimes like Triton and vLLM, GPU orchestration, and performance tuning for high-performance AI inference systems across hybrid cloud environments.

The ideal candidate is a lead engineer with over 7 years of experience in AI infrastructure, specializing in NVIDIA GPU ecosystems, LLM/SLM runtimes, and GPU orchestration. They possess strong expertise in performance tuning, model quantization, and managing large-scale AI deployments across multi-cloud environments.

  • 7+ years of engineering experience
  • Experience with NVIDIA GPU and CUDA ecosystems
  • Experience with LLM/SLM runtimes such as vLLM, TensorRT-LLM, and Triton
  • Hands-on work with model quantization
  • KV cache optimization strategies
  • Disaggregated prefill/decode pipelines
  • GPU resource management and GPU resource managers
  • Orchestration and GPU workload management
  • OCP/GKE administration
NVIDIA GPU, CUDA, cuDNN, NVLink, NVSwitch, MIG, NCCL, Triton, vLLM, GKE
NVIDIA GPU, CUDA, cuDNN, NVLink, NVSwitch, MIG, NCCL, GPU profilers, performance tuning, LLM, SLM, Triton, vLLM, model quantization, FP8, INT4, AWQ, GPTQ, KV cache, disaggregated prefill/decode pipelines, orchestration, GPU workload management, OCP, GKE
strategic thinking, leadership, innovation, analytical thinking, problem-solving
Industry: Technology / AI / Enterprise AI Platforms
Job Function: Lead AI infrastructure development for enterprise GPU platforms

  • Less than 7 years of engineering experience
  • No experience with NVIDIA GPU or CUDA
  • Lack of experience with LLM/SLM runtimes
  • No hands-on experience with GPU orchestration
  • Unable to work in a hybrid environment
