About this role
Lead the architecture and long-term strategy for the Prisma AIRS ML inference platform, driving scalable, secure ML deployment and cross-functional collaboration.
Key Responsibilities
- Lead architectural design of ML inference platform
- Provide technical leadership and mentorship
- Drive model and system performance optimization
- Establish automated model deployment standards
- Collaborate with cross-functional teams to ensure end-to-end system cohesion
Technical Overview
The stack spans distributed ML systems on AWS/GCP/Azure/OCI; Kubernetes and Docker; TensorFlow, PyTorch, ONNX, and TensorRT; LLM inference engines (vLLM, TensorRT-LLM); CUDA kernels and Triton; and Kafka/Spark/Flink, along with strong CI/CD practices.
Ideal Candidate
The ideal candidate is a senior ML platform engineer with deep experience in MLOps, distributed ML systems, and cloud deployment. They bring strong Python skills, hands-on work with Kubernetes and Docker, and a track record of optimizing large-scale ML inference pipelines in production for security domains.
Must-Have Skills
- BS/MS or Ph.D. in Computer Science, a related technical field, or equivalent practical experience
- Extensive professional experience in software engineering with a deep focus on MLOps, ML systems, or productionizing machine learning models at scale
- Expert-level programming skills in Python
- Experience with a systems language like Go, Java, or C++ (nice to have)
- Deep, hands-on experience designing and building large-scale distributed systems on a major cloud platform (GCP, AWS, Azure, or OCI)
- Proven track record of leading the architecture of complex ML systems and MLOps pipelines using Kubernetes and Docker
- Mastery of ML frameworks (TensorFlow, PyTorch) and advanced inference optimization tools (ONNX, TensorRT)
- Experience with modern LLM inference engines (e.g., vLLM, SGLang, TensorRT-LLM) is required
- Familiarity with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI, Tekton)
Nice-to-Have Skills
- Open-source contributions in ML/AI areas
- Experience with low-level performance optimization (custom CUDA kernels, Triton Language)
- Data infrastructure technologies (Kafka, Spark, Flink)
Tools & Platforms
Kubernetes; Docker; TensorFlow; PyTorch; ONNX; TensorRT; vLLM; SGLang; TensorRT-LLM; CUDA; Triton Language; Kafka; Spark; Flink; Jenkins; GitLab CI; Tekton; Amazon Web Services; Google Cloud Platform; Azure; Oracle Cloud Infrastructure; Distributed systems; Transformers
Required Skills
BS/MS or Ph.D. in Computer Science; MLOps; Python; Go; Java; C++; Kubernetes; Docker; TensorFlow; PyTorch; ONNX; TensorRT; vLLM; SGLang; TensorRT-LLM; CUDA; Triton Language; Kafka; Spark; Flink; CI/CD; Jenkins; GitLab CI; Tekton; Cloud platforms (AWS, Google Cloud Platform, Azure, OCI); Distributed systems; Transformers
Hard Skills
Python; Go; Java; C++; Kubernetes; Docker; TensorFlow; PyTorch; ONNX; TensorRT; vLLM; SGLang; TensorRT-LLM; CUDA; Triton Language; custom CUDA kernels; Kafka; Spark; Flink; Jenkins; GitLab CI; Tekton; Amazon Web Services (AWS); Google Cloud Platform (GCP); Oracle Cloud Infrastructure (OCI); Distributed systems; Transformers; GNNs
Soft Skills
leadership; mentoring; communication; problem-solving; strategic thinking; planning; attention to detail; cross-functional collaboration; stakeholder management
Keywords for Your Resume
Python; Kubernetes; Docker; TensorFlow; PyTorch; ONNX; TensorRT; vLLM; SGLang; TensorRT-LLM; CUDA; Triton Language; Kafka; Spark; Flink; Jenkins; GitLab CI; Tekton; Amazon Web Services; Google Cloud Platform; Azure; Oracle Cloud Infrastructure; Distributed systems; MLOps; Transformers