
Product Manager II - Model Lab

at Datadog

Location: New York, New York, USA (Hybrid)
Posted: March 11, 2026
Type: Full-Time
Experience: Mid-level
Years of experience: 4+
Education: Not specified
Category: AI & Machine Learning

This role involves managing Datadog’s experiment tracking platform for AI and ML models, working with research and engineering teams to define product vision, and driving adoption of scalable ML training solutions.

  • Define product vision for Model Lab
  • Lead discovery with AI teams
  • Design experiment tracking system
  • Partner with engineering
  • Drive customer adoption

The role focuses on ML infrastructure, experiment tracking, distributed training, hyperparameter tuning, and reproducibility workflows, using frameworks such as PyTorch, TensorFlow, and JAX.

The ideal candidate is a product manager with at least 4 years of experience in ML infrastructure, experiment tracking, and distributed training, and is familiar with frameworks such as PyTorch, TensorFlow, or JAX. They are entrepreneurial, thrive in ambiguity, and can translate complex ML workflows into product features.

Product Management, Experience with ML infrastructure, Experiment tracking, Distributed training, Hyperparameter tuning
PyTorch, TensorFlow, JAX, ML training workflows, Reproducibility workflows
PyTorch, TensorFlow, JAX, ML frameworks
Product Management, Experiment tracking, ML infrastructure, Data platforms, AI/LLM systems, Distributed training, Hyperparameter tuning, Artifact storage, Evaluation pipelines, PyTorch, TensorFlow, JAX
Entrepreneurial mindset, Ambiguity tolerance, Cross-functional collaboration, Technical translation, Prioritization
Industry: SaaS
Job Function: Product management for ML experiment tracking and infrastructure
Product Management, Experiment tracking, ML infrastructure, Data platforms, AI/LLM systems, Distributed training, Hyperparameter tuning, Artifact storage, Evaluation pipelines, PyTorch, TensorFlow, JAX, Reproducibility workflows

  • Lack of experience in ML infrastructure or experiment tracking
  • No familiarity with ML frameworks
  • Inability to operate in ambiguous environments
  • No experience with distributed training
  • Lack of technical translation skills
