SNMP Extensions: Module 2

Metrics & Dimensions

From OIDs to Metrics

In Module 1, you learned what OIDs are. Now we'll turn them into Dynatrace metrics. Every metric in your extension needs three things:

  1. A key — unique identifier (e.g., cisco_aci_fabric.cpu_usage)
  2. A value source — the OID to read (e.g., oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8)
  3. A type — gauge (point-in-time) or count (cumulative counter)
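As a minimal sketch, the three parts map onto one entry in the metrics: list of an SNMP group or subgroup. The key and OID are the examples from the list above. Because this OID is a table column (it does not end in .0), the entry is assumed to sit in a subgroup with table: true.

```yaml
# One metric entry: key, value source, and type.
metrics:
  - key: cisco_aci_fabric.cpu_usage              # 1. unique key, prefixed with the extension name
    value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8     # 2. OID to read
    type: gauge                                  # 3. gauge or count
```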

Metric Keys

Best practice: prefix with your extension name to avoid collisions.

# Good — unique, identifiable
- key: cisco_aci_fabric.cpu_usage
- key: cisco_aci_fabric.if.in.octets.count

# Bad — too generic, will collide
- key: cpu_usage
- key: bytes_in

Naming rules:

  • Use .count suffix for count metrics (DED006 warning if missing)
  • Do NOT use .count suffix for gauge metrics (DED007 warning)
  • Lowercase, dots as separators
  • Max 250 characters

Metric Types

gauge  — Point-in-time value. CPU %, temperature, memory used, speed.
          Dynatrace stores the value as-is.

count  — Cumulative counter. Bytes transferred, packets, errors.
          Dynatrace records the change (delta) between polls, not the raw counter value.

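Rule of thumb: if the value accumulates events and only goes up (until a reset), it's a count. If it fluctuates, it's a gauge. Uptime is the exception: it only goes up, but you want its current value rather than the change between polls, so poll it as a gauge.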
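A short sketch of both types side by side, assuming the standard IF-MIB ifTable columns ifOperStatus and ifInErrors. The my_device prefix and the key names are illustrative and do not come from a shipped extension.

```yaml
metrics:
  # Gauge: ifOperStatus is a state (1 = up, 2 = down) that can change in either direction.
  - key: my_device.if.oper_status
    value: oid:1.3.6.1.2.1.2.2.1.8
    type: gauge

  # Count: ifInErrors only increases until the counter resets or wraps.
  # If one poll reads 1000 and the next reads 1600, the delta between polls is 600.
  - key: my_device.if.in.errors.count
    value: oid:1.3.6.1.2.1.2.2.1.14
    type: count
```

Both OIDs are columns of the same table, so they could share one table: true subgroup.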

Metric Metadata

Define display names, descriptions, and units in the top-level metrics: section of your extension.yaml. The keys here must match the keys used in the snmp: section:

metrics:
  - key: my_device.cpu_usage
    metadata:
      displayName: CPU Usage (5 min avg)
      description: Average CPU utilization over 5 minutes
      unit: Percent

  - key: my_device.if.in.octets.count
    metadata:
      displayName: Interface Bytes In
      description: Total bytes received on interface
      unit: Byte

  - key: my_device.temperature
    metadata:
      displayName: Temperature
      description: Current sensor temperature
      unit: Celsius

Common units: Percent, Byte, KiloByte, MegaByte, Count, MilliSecond, Celsius, Ampere, MegaBit, Hour

Dimensions

Dimensions are labels attached to metrics. They identify which device, interface, or sensor a metric belongs to.

dimensions:
  # From the device's IP (monitoring config)
  - key: device.address
    value: this:device.address

  # From an SNMP OID (scalar)
  - key: sys.name
    value: oid:1.3.6.1.2.1.1.5.0

  # Constant value
  - key: device.type
    value: const:Cisco Switch

  # From a user variable (monitoring config)
  - key: custom.tag
    value: var:ext.activationtag

Four dimension value sources:

  • oid: — Read from SNMP
  • this: — From monitoring config (device.address, device.port)
  • const: — Hardcoded value
  • var: — User-configurable variable
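A var: reference only resolves if the variable is declared in the extension. The sketch below assumes a top-level vars: section with id, displayName, and type fields. Treat these field names as an assumption and check them against the extension schema for your Dynatrace version.

```yaml
# Assumed declaration for the variable read by the custom.tag dimension above.
vars:
  - id: ext.activationtag
    displayName: Extension activation tag   # label shown in the monitoring configuration
    type: text                              # free-text value entered by the user
```

With a declaration like this, the user sets the value when creating a monitoring configuration, and metrics in the group carry it as the custom.tag dimension.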

Putting It Together

Here's a complete extension.yaml that polls sysUpTime, CPU, and memory from a Cisco device:

name: custom:com.dynatrace.extension.my-cisco-device
version: 0.0.1
minDynatraceVersion: "1.318.0"
author:
  name: Student

metrics:
  - key: my_cisco.sysuptime
    metadata:
      displayName: System Uptime
      unit: Count
  - key: my_cisco.cpu_usage
    metadata:
      displayName: CPU Usage (5 min)
      unit: Percent
  - key: my_cisco.memory_used
    metadata:
      displayName: Memory Used
      unit: KiloByte   # cpmCPUMemoryUsed is reported in kilobytes
  - key: my_cisco.memory_free
    metadata:
      displayName: Memory Free
      unit: KiloByte   # cpmCPUMemoryFree is reported in kilobytes

snmp:
  - group: Device Default
    interval:
      minutes: 1
    dimensions:
      - key: device.address
        value: this:device.address
      - key: sys.name
        value: oid:1.3.6.1.2.1.1.5.0
      - key: device.type
        value: const:Cisco Device
    metrics:
      - key: my_cisco.sysuptime
        value: oid:1.3.6.1.2.1.1.3.0
        type: gauge
    subgroups:
      - subgroup: CPU and Memory
        featureSet: CPU and Memory
        table: true
        dimensions:
          - key: cpu.index
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.1
        metrics:
          - key: my_cisco.cpu_usage
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8
            type: gauge
          - key: my_cisco.memory_used
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.12
            type: gauge
          - key: my_cisco.memory_free
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.13
            type: gauge

Let's break down what's happening:

  1. Group-level dimensions (device.address, sys.name, device.type) are attached to ALL metrics in this group
  2. Group-level metric (sysuptime) is a scalar OID (ends in .0), so no subgroup is needed. sysUpTime reports TimeTicks (hundredths of a second) and is polled as a gauge because the current value matters, not the change between polls
  3. The subgroup sets table: true because the CPU and memory OIDs are table columns (one row per CPU)
  4. featureSet makes this subgroup toggleable in the monitoring config UI
  5. Subgroup dimension (cpu.index) identifies which CPU each row belongs to

Dimension Filters

Filter out unwanted rows from table walks:

dimensions:
  - key: if.name
    value: oid:1.3.6.1.2.1.31.1.1.1.1
    filter: const:$not($eq(n/a))          # Skip "n/a" interfaces

  - key: if.descr
    value: oid:1.3.6.1.2.1.2.2.1.2
    filter: var:ifNameFilter               # User-configurable filter

Filter expressions: $prefix(), $suffix(), $contains(), $eq(), $not(), $and(), $or()
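The expressions can be nested. The sketch below assumes the comma-separated argument form $and(<expr1>,<expr2>) and $or(<expr1>,<expr2>). The interface name patterns are made up for illustration.

```yaml
dimensions:
  # Keep only interfaces whose name starts with "Gi" or "Te".
  - key: if.name
    value: oid:1.3.6.1.2.1.31.1.1.1.1
    filter: const:$or($prefix(Gi),$prefix(Te))

    # Alternative for the same dimension: keep names containing "Ethernet"
    # but drop those starting with "Fast".
    # filter: const:$and($contains(Ethernet),$not($prefix(Fast)))
```

Rows that do not match the filter are skipped for every metric in the subgroup, so a single filtered dimension is enough to trim the whole table walk.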

Common Mistakes

Mistake                                    Error    Fix
─────────────────────────────────────────  ───────  ──────────────────────────
Scalar OID (.0) in table: true subgroup    DED016   Remove .0 or set table: false
Table OID (no .0) in table: false group    DED017   Add .0 or set table: true
OIDs from different tables in same group   DED018   Split into separate subgroups
Metric key defined but never used          DED020   Remove from metrics: section
Metric key used but never defined          DED021   Add to metrics: section
count metric without .count suffix         DED006   Add .count to key name
gauge metric with .count suffix            DED007   Remove .count from key name
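The first three rows come down to where an OID is placed. Below is a sketch of a layout that avoids them, reusing OIDs from this module plus the IF-MIB ifInErrors column. The group, subgroup, and key names are illustrative.

```yaml
snmp:
  - group: Device
    interval:
      minutes: 1
    dimensions:
      - key: device.address
        value: this:device.address
    metrics:
      # Scalar OID (ends in .0): stays at group level, outside any table: true subgroup.
      - key: my_device.sysuptime
        value: oid:1.3.6.1.2.1.1.3.0
        type: gauge
    subgroups:
      # Table columns (no .0) go into table: true subgroups, one subgroup per table.
      - subgroup: CPU
        table: true
        metrics:
          - key: my_device.cpu_usage
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8
            type: gauge
      - subgroup: Interfaces
        table: true
        metrics:
          - key: my_device.if.in.errors.count
            value: oid:1.3.6.1.2.1.2.2.1.14
            type: count
```

Each key used here would also need a matching entry in the top-level metrics: section, otherwise DED021 applies.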

What's Next

In Module 3, we'll organize metrics into multiple groups and feature sets — essential for fault isolation and giving users control over what gets monitored.

Hands-On Exercise

Edit the YAML in the editor, then click "Check My Work" to validate.

Metrics & Dimensions

Build a complete interface monitoring subgroup with proper metric types and dimensions.

  • Add gauge metrics for interface speed and admin status
  • Add count metrics for inbound/outbound octets (remember the .count suffix rule!)
  • Add an if.name dimension from ifXTable
  • Add a filter to exclude interfaces named n/a

Watch for DED006 (count naming) and DED007 (gauge naming) warnings.
