Opportunity Description
Is this role right for you?
In this role you will:
Design and maintain end-to-end monitoring for critical services using Dynatrace (APM, Real User Monitoring, Synthetic, Davis AI, Smartscape) and GCP Cloud Monitoring (metrics, alerting policies, SLOs/SLIs, uptime checks, dashboards).
Build service maps, dependency models, and problem detection in Dynatrace; tune Davis AI problem rules and reduce alert noise through thresholds, baselining, and tagging.
Implement SLOs/SLIs with error budgets; continuously review burn rates and align alerting to customer impact.
Partner with application teams to instrument code paths (e.g., Dynatrace OneAgent), trace distributed transactions, and validate golden signals (latency, traffic, errors, saturation).
Create and optimize Splunk data models, indexes, sourcetypes, ingestion pipelines, and SPL searches; build actionable dashboards for NOC/SRE/Engineering.
Develop operational an...
Full-time
Computer Occupations