Dynatrace Training Platform

SLO Migration

Gen2 SLOs used metric-based SLIs. Gen3 SLOs use DQL-based SLIs — more flexible, queryable, and integrated with Grail.

🔧 Migration Step: Convert Classic SLOs

Step  Action                                  Where
────  ──────────────────────────────────────  ──────────────────────────────────
1     List classic SLOs                       Service-Level Objectives Classic app
2     Note each SLO's metric, target, window  Export or screenshot for reference
3     Create platform SLO from template       Service-Level Objectives app → + SLO
4     Select matching template                Service availability, response time, etc.
5     Choose entity and set target            Match your Gen2 values
6     Verify SLO value matches Gen2           Compare side-by-side for 1 week
7     Set up burn rate alerting               Create anomaly detector on SLO metric
8     Disable classic SLO after validation    Disable first, don't delete
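
The side-by-side check in step 6 can be sketched outside Dynatrace once you have exported one SLO value per day from each app. This is a minimal illustration, not a Dynatrace feature: the function name, the sample values, and the 0.1 percentage-point tolerance are all hypothetical.

```python
from typing import List, Sequence


def mismatched_days(
    classic: Sequence[float],
    platform: Sequence[float],
    tolerance_pct: float = 0.1,  # hypothetical tolerance, in percentage points
) -> List[int]:
    """Return the day indexes where the two SLO values differ by more than the tolerance."""
    if len(classic) != len(platform):
        raise ValueError("both series need one value per day")
    return [
        day
        for day, (old, new) in enumerate(zip(classic, platform))
        if abs(old - new) > tolerance_pct
    ]


# Made-up daily SLO values for one week (classic vs. platform SLO).
classic_week = [99.62, 99.60, 99.58, 99.61, 99.63, 99.59, 99.60]
platform_week = [99.61, 99.60, 99.57, 99.20, 99.63, 99.58, 99.60]

print(mismatched_days(classic_week, platform_week))  # [3]
```

An empty list after the full week suggests the platform SLO tracks the classic one closely enough to move on to step 8.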

⚠️ Dynatrace is working on a dedicated SLO upgrade guide. Classic SLOs use functional metrics that behave differently during transformation. For now, recreate SLOs manually using the built-in templates — they handle the DQL SLI generation automatically.

Creating an SLO (UI)

🛠 Try it: Ctrl+K → "Service-Level Objectives" → Create new → Choose a template (Service availability, Host CPU, etc.) → Select entities → Set target (e.g., 99.5%) → Save.

Built-in Templates

Template                              What It Measures
────────────────────────────────────  ──────────────────────────────────
Host CPU usage utilization            CPU idle time across hosts
Service availability                  Successful request ratio
Service performance                   Requests faster than X ms
K8s cluster CPU usage efficiency      Cluster CPU utilization
K8s cluster memory usage efficiency   Cluster memory utilization
K8s namespace CPU/memory efficiency   Namespace-level resource usage

Custom SLO with DQL

For custom SLIs, write a DQL timeseries query. The SLI must be a single metric-based timeseries aggregation:

// Host CPU idle (works — single metric aggregation)
timeseries sli = avg(dt.host.cpu.idle), by:{dt.entity.host}

// Service request count (works — single metric)
timeseries sli = sum(dt.service.request.count), by:{dt.entity.service}

⚠️ SLI limitation: Arithmetic expressions such as (count - failures) / count * 100 are rejected with the error "parameter has to be a metric-based timeseries aggregation". For availability SLOs, use the built-in templates, which handle the math internally.
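
For reference, this is the arithmetic that the availability expression stands for, worked in plain Python rather than DQL. It only illustrates the ratio; it is not how the templates are implemented, and treating zero traffic as 100% available is an assumption made here.

```python
def availability_pct(total_requests: int, failed_requests: int) -> float:
    """Share of successful requests, as a percentage."""
    if total_requests == 0:
        return 100.0  # assumption: no traffic counts as fully available
    return (total_requests - failed_requests) / total_requests * 100


# 100 failures out of 20,000 requests.
print(availability_pct(20_000, 100))
```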

⚠️ API gotcha: The criteria field must be a JSON array, not an object. Sending an object results in a 400 error.
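
A small sketch of a request body with criteria as an array, plus a local check you could run before sending it. Only the criteria field comes from this page; every other field name (customSli, indicator, timeframeFrom, timeframeTo, target, warning) and the helper functions are assumptions for illustration, so check them against the API reference for your environment.

```python
import json


def build_slo_payload(name: str, sli_query: str, target: float, warning: float) -> dict:
    """Assemble a request body. Every field name except `criteria` is an assumption."""
    return {
        "name": name,
        "customSli": {"indicator": sli_query},  # assumed field names
        "criteria": [  # an array, even for a single criterion
            {
                "timeframeFrom": "now-7d",  # assumed field names and values
                "timeframeTo": "now",
                "target": target,
                "warning": warning,
            }
        ],
    }


def check_criteria_is_array(payload: dict) -> None:
    """Fail locally instead of waiting for a 400 response."""
    if not isinstance(payload.get("criteria"), list):
        raise TypeError("criteria must be a JSON array, not an object")


payload = build_slo_payload(
    name="Host CPU idle",
    sli_query="timeseries sli = avg(dt.host.cpu.idle), by:{dt.entity.host}",
    target=95.0,
    warning=98.0,
)
check_criteria_is_array(payload)
print(json.dumps(payload, indent=2))
```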

Out-of-the-Box SLO Templates

Gen3 includes pre-built SLO templates — no need to write DQL from scratch:

Ctrl+K → "Service-Level Objectives" → Create new
Choose a template (service availability, response time, synthetic, etc.)
Select your entity and target
The DQL SLI is auto-generated

💡 After creating from a template, click "Edit SLI" to see the generated DQL — great way to learn SLO query patterns.

ACE Best Practice: SLO Tiers

The Dynatrace ACE team recommends three SLO tiers based on business criticality:

Tier      Target   Warning   Burn Rate Threshold   Use Case
────────  ───────  ────────  ────────────────────   ──────────────────────
High      99.0%    99.5%     4                      Revenue-critical services
Medium    98.0%    99.0%     10                     Internal business apps
Low       95.0%    98.0%     20                     Dev/staging environments
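
The tier table can be captured as data so the same thresholds drive every check. This is a sketch under assumptions: the class and function names are hypothetical, and the status rule (below target is failing, between target and warning is a warning) is one plausible reading of the Target and Warning columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloTier:
    target: float               # SLO target in percent
    warning: float              # warning threshold in percent
    burn_rate_threshold: float  # alert when burn rate exceeds this


# Values copied from the tier table above.
TIERS = {
    "High": SloTier(target=99.0, warning=99.5, burn_rate_threshold=4),
    "Medium": SloTier(target=98.0, warning=99.0, burn_rate_threshold=10),
    "Low": SloTier(target=95.0, warning=98.0, burn_rate_threshold=20),
}


def slo_status(tier_name: str, slo_value: float) -> str:
    """Classify an SLO value against the tier's target and warning levels."""
    tier = TIERS[tier_name]
    if slo_value < tier.target:
        return "failing"
    if slo_value < tier.warning:
        return "warning"
    return "healthy"


def burn_rate_exceeded(tier_name: str, burn_rate: float) -> bool:
    """True when the burn rate is above the tier's threshold."""
    return burn_rate > TIERS[tier_name].burn_rate_threshold


print(slo_status("High", 99.2))         # warning
print(burn_rate_exceeded("Medium", 5))  # False
```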

Burn Rate Alerting

Burn rate measures how fast you're consuming your error budget. A burn rate of 1 means you'll exactly exhaust the budget by the end of the window. Higher = faster burn = more urgent.
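
As a worked example, burn rate is commonly computed as the observed error rate divided by the error rate the target allows. The sketch below uses that common definition; the function names and the 7-day window are assumptions, and the exact calculation Dynatrace performs may differ.

```python
def burn_rate(observed_sli_pct: float, target_pct: float) -> float:
    """Observed error rate divided by the allowed error rate (the error budget)."""
    error_budget = 100.0 - target_pct
    if error_budget <= 0:
        raise ValueError("a 100% target leaves no error budget")
    return (100.0 - observed_sli_pct) / error_budget


def days_until_budget_exhausted(rate: float, window_days: float) -> float:
    """Days a full budget lasts if the current burn rate holds."""
    if rate <= 0:
        return float("inf")
    return window_days / rate


# Target 99.5%, currently observing 99.0%: errors arrive at twice the allowed rate.
rate = burn_rate(observed_sli_pct=99.0, target_pct=99.5)
print(rate)                                               # 2.0
print(days_until_budget_exhausted(rate, window_days=7))   # 3.5
```

At a burn rate of 2, a 7-day budget is gone in 3.5 days, which is why higher values call for faster action.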

Burn Rate    Meaning                              Action
───────────  ──────────────────────────────────   ──────────────────
< 1          Under budget — healthy                None
1-4          Slow burn — will miss SLO eventually  Investigate
4-10         Fast burn — urgent                    Page on-call
> 10         Critical — SLO will fail soon         Immediate action

💡 Create an anomaly detector that monitors SLO burn rate. When burn rate exceeds the tier threshold, trigger a workflow to notify the team. This is proactive SLO management: the team can respond before the error budget is exhausted.
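
The burn rate bands from the table above can be expressed as a small lookup, for example inside whatever script or workflow step decides who to notify. The function name is hypothetical, and the sketch assumes a value of exactly 4 or exactly 10 falls in the fast-burn band.

```python
def burn_rate_action(rate: float) -> str:
    """Map a burn rate to the action listed in the burn rate table."""
    if rate < 1:
        return "None"
    if rate < 4:
        return "Investigate"
    if rate <= 10:  # assumption: 4 and 10 both count as fast burn
        return "Page on-call"
    return "Immediate action"


for sample in (0.5, 2, 6, 15):
    print(sample, "->", burn_rate_action(sample))
```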
