✦ Luna Orbit — Software Engineering

Untitled Position

at Company

Hybrid Posted March 30, 2026
Type Full-Time
Experience Senior
Exp. Years 5+ years
Education Master's degree in Mathematics, Physics, Computer Science, or a relevant STEM field; OR Ph.D. degree in Mathematics, Physics, Computer Science, or a relevant STEM field
Category Software Engineering

Software engineering role focused on developing and optimizing the oneDNN performance library for Intel CPUs/GPUs, with a strong emphasis on linear algebra kernels and HPC.

  • Develop and optimize oneDNN components
  • Collaborate across teams
  • Maintain open-source contributions
  • Implement performance improvements
  • Support LLVM/AI frameworks

Stack includes C/C++, Linux, CUDA/OpenCL, x86 intrinsics; work includes performance tuning of mathematical libraries and ML-related components.

The ideal candidate is a senior software engineer with 5+ years in C/C++, Linux, and high-performance computing, deeply experienced with CUDA/OpenCL and linear algebra libraries, and capable of optimizing oneDNN components for Intel CPUs/GPUs.

  • Master's degree in Mathematics, Physics, Computer Science, or a relevant STEM field; OR Ph.D. degree in Mathematics, Physics, Computer Science, or a relevant STEM field
  • 5+ years of experience in C and C++
  • Maintaining or contributing to open-source software projects
  • Software libraries design and architecture
  • Implementation of linear algebra algorithms (BLAS, LAPACK, or PyTorch)
  • Performance engineering and software performance optimizations
  • Floating point arithmetic and numerical stability
  • Software development on Linux
  • Low-level performance optimizations using CUDA, x86 assembly or intrinsics, or OpenCL

  • 3+ years: Machine learning and deep learning algorithms or HPC applications development
  • 3+ years: Floating point implementations of transcendental functions (sin, cos, tanh, elu, etc.)
  • 1+ year: Algorithms for non-IEEE low precision data types (bfloat16, fp8, fp4)
  • 1+ year: AI assisted software development
C, C++, Linux, CUDA, OpenCL, x86 assembly, intrinsics, BLAS, LAPACK, PyTorch, Machine Learning, High-Performance Computing, Open-source
C, C++, Linux, CUDA, OpenCL, x86 assembly, intrinsics, BLAS, LAPACK
Written and verbal communication, Teamwork, Problem solving
Industry Semiconductors
Job Function Develop and optimize performance-critical library components for oneDNN in AI workloads
Role Subtype Software Engineer
Tech Domains C++, C, Linux, CUDA, OpenCL, BLAS, LAPACK, PyTorch, oneDNN
oneDNN, software development engineer, C++, C, Linux, CUDA, OpenCL, x86 assembly, intrinsics, BLAS, LAPACK, PyTorch, Machine Learning, High-Performance Computing, Open-source, JMP, JSL, Python

Fewer than 5 years of C/C++ experience; no experience with Linux or CUDA/OpenCL; no exposure to high-performance computing or ML workloads.
