About this role
Own and evolve end-to-end observability using Datadog within an SRE framework. Implement monitoring standards, operate telemetry platforms, and support incident response and post-incident improvements across production systems.
Key Responsibilities
- Own and evolve end-to-end observability using Datadog
- Design and enforce monitoring standards (alert quality, Golden signals, SLO/SLA)
- Serve as primary Datadog platform specialist (dashboards, monitors, service catalog, integrations)
- Support production incident response using Datadog and Splunk
- Partner with engineering teams to improve instrumentation and adopt OpenTelemetry
Technical Overview
This is a tool-focused observability role centered on Datadog capabilities: APM, distributed tracing, DBM, log ingestion/parsing/correlation, and synthetic monitoring (plus RUM where applicable). You will apply Golden signals and SLO/SLA-aligned monitoring, integrate telemetry with ServiceNow and PagerDuty, and adopt OpenTelemetry for instrumentation.
Ideal Candidate
The ideal candidate is a hands-on observability/SRE engineer with strong Datadog expertise, including APM, distributed tracing, and DBM. They can own monitoring standards (Golden signals, SLO/SLA alignment), operate telemetry pipelines, and lead incident triage using Datadog and Splunk while integrating with ServiceNow and PagerDuty.
Must-Have Skills
- Strong experience using Datadog
- Hands-on implementation and operation of enterprise telemetry platforms
- APM
- Distributed Tracing
- DBM
- Log ingestion, parsing, pipelines, and correlation
- Synthetic monitoring
- Monitoring standards with Golden signals and SLO/SLA-aligned monitoring
- Primary Datadog platform specialist responsibilities (dashboards, monitors, service catalog, integrations)
Tools & Platforms
Datadog, Splunk, ServiceNow, PagerDuty, OpenTelemetry
Required Skills
Site Reliability Engineering (SRE), Datadog, Observability, APM, Distributed Tracing, DBM, log ingestion, log parsing, pipelines, correlation, Synthetic monitoring, RUM, AI-driven alerting, Watchdog, anomaly detection, Golden signals, SLO, SLA, telemetry hygiene, dashboards, monitors, service catalog, integrations, Splunk, incident response, root-cause analysis, post-incident reviews, ServiceNow, PagerDuty, OpenTelemetry, DR testing, operational readiness reviews
Hard Skills
Site Reliability Engineering (SRE), observability, Datadog, APM (Application Performance Monitoring), distributed tracing, DBM (Database Monitoring), log ingestion, log parsing, pipelines, correlation, synthetic monitoring, RUM, AI-driven alerting, Watchdog, anomaly detection, monitoring standards, alert quality, signal-to-noise reduction, Golden signals, SLO (Service Level Objective), SLA (Service Level Agreement), monitoring tagging, telemetry hygiene, dashboards, monitors, service catalog, integrations, cost visibility, cost optimization, Splunk, incident response, root-cause analysis, post-incident reviews, ITSM tools, ServiceNow, PagerDuty, OpenTelemetry, instrumentation (APM, custom metrics, logs), DR testing, operational readiness reviews
Soft Skills
hands-on execution, incident response leadership, cross-functional collaboration, enablement and onboarding, communication with application teams, continuous improvement mindset
Keywords for Your Resume
Site Reliability Engineer, Site Reliability Engineering (SRE), Observability Engineer, Datadog, APM, Application Performance Monitoring, Distributed Tracing, DBM, Database Monitoring, log ingestion, log parsing, log pipelines, correlation, Synthetic monitoring, RUM, Real User Monitoring, AI-driven alerting, Watchdog, anomaly detection, monitoring standards, signal-to-noise reduction, Golden signals, SLO, Service Level Objective, SLA, Service Level Agreement, telemetry hygiene, dashboards, monitors, service catalog, integrations, Splunk, incident response, root-cause analysis, post-incident reviews, ServiceNow, PagerDuty, ITSM, OpenTelemetry, DR testing, operational readiness reviews, APM (Application Performance Monitoring), DBM (Database Monitoring), SLO (Service Level Objective)
Deal Breakers
- Strong experience using Datadog
- Ability to operate Datadog for APM, Distributed Tracing, and DBM
- Experience with log ingestion/parsing/correlation and synthetic monitoring