Sr. Software Development Engineer (GPU Machine Learning Performance)

at Advanced Micro Devices

📍 Austin, Texas, United States
Hybrid
Posted March 18, 2026
Type Full-Time
Experience Senior
Exp. Years 5+ years
Education Not specified
Category AI & Machine Learning

This role involves developing and optimizing GPU-based machine learning inference solutions, focusing on performance analysis, workload profiling, and hardware-software collaboration to meet next-generation AI performance goals.

  • Work on GPU ML workloads
  • Profile inference pipelines
  • Optimize transformer models
  • Collaborate with compiler and hardware teams
  • Develop performance strategies

The position requires expertise in GPU performance profiling, ML workload development, transformer model optimization, and collaboration with hardware and compiler teams, utilizing tools like CUDA and profiling software.

The ideal candidate is a senior AI/ML engineer with extensive experience in GPU performance analysis, workload optimization, and hardware-software collaboration, particularly in AI inference solutions.

GPU performance analysis, Machine Learning workload development, profiling end-to-end inference pipelines, optimizing execution paths for transformer models, collaborating with hardware and software teams
GPU edge inference solutions, pre and post-silicon development, memory layout optimization, distributed systems, compiler collaboration
GPU, CUDA, profilers, performance analysis tools, compiler tools
GPU, Machine Learning, performance profiling, workload optimization, GPU performance analysis, transformer models, quantization, distributed execution, compiler, kernel development, hardware architecture
collaboration, problem-solving, performance optimization, teamwork, communication
Industry Technology
Job Function Developing and optimizing GPU-based machine learning inference solutions
Role Subtype AI & Machine Learning
Tech Domains GPU, Machine Learning, Hardware architecture
GPU, Machine Learning, performance profiling, workload optimization, GPU performance analysis, transformer models, quantization, distributed execution, compiler, kernel development, hardware architecture, AI, ML, GPU edge inference, pre-silicon, post-silicon, profiling, distributed systems, kernel, performance optimization, edge inference

Lack of experience in GPU performance profiling, no background in machine learning workloads, no experience with hardware architecture
