Position Details
About this role
This role involves designing scalable AI and HPC clusters for data centers, with a focus on power, storage, and networking components. The engineer will evaluate and select hardware to optimize performance and reliability.
Key Responsibilities
- Design scalable AI/HPC clusters including compute, storage, and networking
- Evaluate and select CPUs, GPUs, accelerators, interconnects, and memory configurations
- Design power delivery solutions for high-density deployments
- Define power budgets, redundancy, and fault tolerance
- Collaborate with cross-functional teams
Technical Overview
The technical environment includes high-performance compute hardware, storage solutions such as Lustre and Ceph, networking topologies, and power delivery systems, with a focus on scalable and reliable infrastructure design.
Ideal Candidate
The ideal candidate is a mid-level systems engineer with experience in HPC, AI infrastructure, and data center systems. They possess strong technical knowledge of compute, storage, networking, and power delivery components, with the ability to design scalable and reliable infrastructure.
Deal Breakers
- Lack of experience in HPC or data center engineering
- No knowledge of power delivery systems
- No experience with compute or storage components