Migrate Classic → Platform SLOs
SLO Migration
Gen2 SLOs used metric-based SLIs. Gen3 SLOs use DQL-based SLIs, which are more flexible, queryable, and integrated with Grail.
🔧 Migration Step: Convert Classic SLOs
Step  Action                                  Where
────  ──────────────────────────────────────  ─────────────────────────────────────────
1     List classic SLOs                       Service-Level Objectives Classic app
2     Note each SLO's metric, target, window  Export or screenshot for reference
3     Create platform SLO from template       Service-Level Objectives app → + SLO
4     Select matching template                Service availability, response time, etc.
5     Choose entity and set target            Match your Gen2 values
6     Verify SLO value matches Gen2           Compare side-by-side for 1 week
7     Set up burn rate alerting               Create anomaly detector on SLO metric
8     Disable classic SLO after validation    Don't delete; disable first
⚠️ Dynatrace is working on a dedicated SLO upgrade guide. Classic SLOs use functional metrics that behave differently during transformation. For now, recreate SLOs manually using the built-in templates, which generate the DQL SLI automatically.
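Step 6 (compare side-by-side for 1 week) can be made less subjective with a small check. The following is a minimal sketch, not a Dynatrace feature: the function name, the 0.1 percentage-point tolerance, and the sample values are all hypothetical, and the daily SLO values are assumed to have been copied out of both apps by hand.

```python
def compare_slo_values(classic, platform, tolerance=0.1):
    """Compare paired daily SLO values (in percent).

    Returns (largest absolute difference, True if it is within tolerance).
    The tolerance is an assumed value; pick one that suits your SLO.
    """
    if not classic or len(classic) != len(platform):
        raise ValueError("both series need the same, non-zero number of samples")
    worst = max(abs(c - p) for c, p in zip(classic, platform))
    return worst, worst <= tolerance


# Hypothetical readings for one week, one value per day.
classic_week = [99.62, 99.58, 99.71, 99.66, 99.60, 99.64, 99.69]
platform_week = [99.60, 99.59, 99.70, 99.61, 99.60, 99.66, 99.68]

worst, ok = compare_slo_values(classic_week, platform_week)
print(f"largest daily difference: {worst:.2f} points, within tolerance: {ok}")
```

If the check fails, keep the classic SLO enabled and look at the generated SLI before moving on to step 8.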
Creating an SLO (UI)
Try it: Ctrl+K → "Service-Level Objectives" → Create new → Choose a template (Service availability, Host CPU, etc.) → Select entities → Set target (e.g., 99.5%) → Save.
Built-in Templates
Template                             What It Measures
───────────────────────────────────  ──────────────────────────────
Host CPU usage utilization           CPU idle time across hosts
Service availability                 Successful request ratio
Service performance                  Requests faster than X ms
K8s cluster CPU usage efficiency     Cluster CPU utilization
K8s cluster memory usage efficiency  Cluster memory utilization
K8s namespace CPU/memory efficiency  Namespace-level resource usage
Custom SLO with DQL
For custom SLIs, write a DQL timeseries query. The SLI must be a single metric-based timeseries aggregation:
// Host CPU idle (works: single metric aggregation)
timeseries sli = avg(dt.host.cpu.idle), by:{dt.entity.host}
// Service request count (works: single metric)
timeseries sli = sum(dt.service.request.count), by:{dt.entity.service}
⚠️ SLI limitation: Arithmetic expressions like (count - failures) / count * 100 fail with the error "parameter has to be a metric-based timeseries aggregation". Use the built-in templates for availability SLOs; they handle the math internally.
⚠️ API gotcha: The criteria field must be an array, not an object. Sending an object causes a 400 error.
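To illustrate the criteria gotcha, here is a minimal sketch of a request body built in Python. Only the criteria field and its array shape come from this guide; every other field name (name, customSli, indicator, timeframeFrom, timeframeTo, target, warning) and both helper functions are assumptions for illustration, so check them against the SLO API reference before use. No request is sent.

```python
import json


def build_slo_payload(name, sli_query, target, warning,
                      timeframe_from="now-7d", timeframe_to="now"):
    """Build an SLO request body. Field names are assumed, not verified."""
    return {
        "name": name,
        "customSli": {"indicator": sli_query},
        # criteria is a LIST of criterion objects, even when there is only one.
        "criteria": [
            {
                "timeframeFrom": timeframe_from,
                "timeframeTo": timeframe_to,
                "target": target,
                "warning": warning,
            }
        ],
    }


def check_criteria_is_array(payload):
    """Fail locally instead of waiting for the API to answer with a 400."""
    if not isinstance(payload.get("criteria"), list):
        raise TypeError("criteria must be an array (list), not an object (dict)")
    return payload


payload = build_slo_payload(
    "Host CPU idle",
    "timeseries sli = avg(dt.host.cpu.idle), by:{dt.entity.host}",
    target=95.0,
    warning=98.0,
)
body = json.dumps(check_criteria_is_array(payload), indent=2)
print(body)
```

Running the check before the POST turns a remote 400 into a local, readable error.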
Out-of-the-Box SLO Templates
Gen3 includes pre-built SLO templates, so there is no need to write DQL from scratch:
- Ctrl+K → "Service-Level Objectives" → Create new
- Choose a template (service availability, response time, synthetic, etc.)
- Select your entity and target
- The DQL SLI is auto-generated
💡 After creating from a template, click "Edit SLI" to see the generated DQL. It is a great way to learn SLO query patterns.
ACE Best Practice: SLO Tiers
The Dynatrace ACE team recommends three SLO tiers based on business criticality:
Tier    Target  Warning  Burn Rate Threshold  Use Case
──────  ──────  ───────  ───────────────────  ─────────────────────────
High    99.0%   99.5%    4                    Revenue-critical services
Medium  98.0%   99.0%    10                   Internal business apps
Low     95.0%   98.0%    20                   Dev/staging environments
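The tier table can be kept as data so that SLO definitions and alert thresholds come from a single place. This is a sketch under assumptions: the SloTier class and both helper functions are hypothetical names, and it assumes that a value below the target counts as a failure and a value between target and warning counts as a warning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTier:
    target: float               # percent
    warning: float              # percent
    burn_rate_threshold: float


# Values copied from the tier table above.
TIERS = {
    "high": SloTier(target=99.0, warning=99.5, burn_rate_threshold=4),
    "medium": SloTier(target=98.0, warning=99.0, burn_rate_threshold=10),
    "low": SloTier(target=95.0, warning=98.0, burn_rate_threshold=20),
}


def status_for(tier_name, slo_value):
    """Classify an SLO value (percent) against the tier's target and warning."""
    tier = TIERS[tier_name.lower()]
    if slo_value < tier.target:
        return "failure"
    if slo_value < tier.warning:
        return "warning"
    return "ok"


def burn_rate_alert(tier_name, burn_rate):
    """True when the burn rate exceeds the tier's threshold."""
    return burn_rate > TIERS[tier_name.lower()].burn_rate_threshold


print(status_for("High", 99.2))        # warning
print(burn_rate_alert("Medium", 12))   # True
```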
Burn Rate Alerting
Burn rate measures how fast you're consuming your error budget. A burn rate of 1 means you'll exactly exhaust the budget by the end of the window. A higher burn rate means the budget runs out sooner, so the response is more urgent.
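As a worked example of that definition, the sketch below uses a common formulation that this guide does not spell out, so treat it as an assumption: burn rate is the observed error rate divided by the error rate the target allows. The function names are hypothetical.

```python
def burn_rate(sli_percent, target_percent):
    """Observed error rate divided by the error rate the target allows."""
    allowed = 100.0 - target_percent
    if allowed <= 0:
        raise ValueError("a 100% target leaves no error budget")
    return (100.0 - sli_percent) / allowed


def days_until_budget_exhausted(rate, window_days):
    """Days a full error budget lasts if the burn rate is sustained."""
    if rate <= 0:
        return float("inf")
    return window_days / rate


# Target 99.0% allows 1% errors; an SLI of 98.0% means 2% errors.
rate = burn_rate(sli_percent=98.0, target_percent=99.0)
print(rate)                                              # 2.0
print(days_until_budget_exhausted(rate, window_days=7))  # 3.5
```

With a burn rate of 2, a 7-day budget is gone in 3.5 days; with a burn rate of 1 it lasts exactly the 7-day window.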
Burn Rate  Meaning                              Action
─────────  ───────────────────────────────────  ────────────────
< 1        Under budget, healthy                None
1-4        Slow burn, will miss SLO eventually  Investigate
4-10       Fast burn, urgent                    Page on-call
> 10       Critical, SLO will fail soon         Immediate action
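The table maps directly onto a lookup function, for example inside a notification script. This is a minimal sketch with a hypothetical function name. The table's ranges share the boundary values 4 and 10, so the code assumes that exactly 4 counts as a fast burn and exactly 10 still counts as a fast burn, because the critical row is strictly "> 10".

```python
def action_for_burn_rate(rate):
    """Map a burn rate to the action column of the table above."""
    if rate < 1:
        return "None"              # under budget, healthy
    if rate < 4:
        return "Investigate"       # slow burn
    if rate <= 10:
        return "Page on-call"      # fast burn (boundary handling is assumed)
    return "Immediate action"      # critical


for sample in (0.5, 2, 4, 10, 15):
    print(sample, "->", action_for_burn_rate(sample))
```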
💡 Create an anomaly detector that monitors SLO burn rate. When burn rate exceeds the tier threshold, trigger a workflow to notify the team. This is proactive SLO management: the team can fix issues before the error budget runs out and the SLO is missed.