Position Details

Type Full-Time

Experience Senior

Exp. Years Not specified

Education Not specified

Category AI & Machine Learning

About this role

This role involves advancing the Triton compiler and runtime for AMD GPUs, focusing on distributed execution, communication, and performance optimization for AI workloads.

Key Responsibilities

Develop AMD GPU backend for Triton; Build distributed communication capabilities; Optimize kernel performance; Collaborate with hardware teams; Enhance compiler infrastructure

Technical Overview

The technical environment spans GPU architecture, compiler infrastructure such as MLIR and LLVM, AMD Instinct accelerators, and performance-tuning tools, all within AMD's Triton ecosystem.

Ideal Candidate

The ideal candidate is a senior AI/ML software engineer with deep expertise in GPU architecture, compiler technologies, and distributed systems. They should have experience optimizing GPU kernels, working with AMD Instinct accelerators, and developing scalable AI training and inference solutions.

Must-Have Skills

GPU architecture, Compiler technologies, Distributed GPU systems, C, C++, Python, Experience with GPU runtime or communication stack

Nice-to-Have Skills

MLIR, LLVM, Kernel optimization, Hardware utilization, AMD Instinct GPUs

Tools & Platforms

ROCm, LLVM, MLIR, Git, Linux

Required Skills

GPU architecture, Compiler Technologies, Distributed GPU Systems, GPU Runtime, Communication Stack, MLIR, LLVM, Kernel Optimization, Hardware Utilization, AMD Instinct Accelerators, Performance Engineering, C, C++, Python, GPU Communication, Distributed Execution

Hard Skills

GPU architecture, Compiler Technologies, Distributed GPU Systems, GPU Runtime, Communication Stack, Compiler Backend, MLIR, LLVM, Kernel Optimization, AMD Instinct Accelerators, Performance Engineering, Hardware Utilization, C, C++, Python, GPU Communication, Distributed Execution

Soft Skills

Problem-Solving, Collaboration, Technical Communication, Analytical Thinking, Innovation

Industry & Role

Industry Semiconductors & Electronics

Job Function Enhancing Triton compiler and runtime for AMD GPUs in AI applications

Role Subtype AI & Machine Learning

Tech Domains Linux, LLVM, MLIR, GPU, Compiler Technologies, Performance Engineering

Keywords for Your Resume

GPU architecture, Compiler technologies, Distributed GPU systems, GPU runtime, Communication stack, MLIR, LLVM, Kernel optimization, Hardware utilization, AMD Instinct, Performance engineering, C, C++, Python, GPU communication, Distributed execution, AI frameworks, PyTorch, Large-scale AI, Distributed training

Deal Breakers

Lack of experience with GPU compiler or runtime, No background in GPU architecture or performance engineering, Unable to work in a hybrid environment, No experience with AMD GPUs

Senior Software Engineer - AI Triton Communication