Position Details
About this role
This role involves designing and building scalable deployment infrastructure for AI inference models across hardware accelerators, ensuring continuous, unattended deployment with high efficiency.
Key Responsibilities
- Own deployment orchestration
- Improve deployment scheduling
- Extend deployment observability
- Drive down cycle time
- Optimize fleet rollout strategies
Technical Overview
The role focuses on AI inference deployment systems, covering resource management, automation, and monitoring across GPU, TPU, and Trainium hardware, with a strong emphasis on systems engineering and scalability.
Ideal Candidate
The ideal candidate is a software engineer with over 5 years of experience in building deployment and automation systems at scale, particularly involving inference deployment on accelerators like GPUs, TPUs, and Trainium. They should have strong systems design skills and experience managing resource-constrained environments.
Deal Breakers
- Less than 5 years of relevant experience
- Lack of experience with deployment systems at scale
- No experience with GPU, TPU, or Trainium hardware