Groups & Feature Sets
Why Multiple Groups Matter
In Module 2, we put everything in one group. That works, but it has a critical flaw: if one subgroup fails, ALL metrics in the group stop.
This is a real production bug. Here's what happened with a Cisco ACI spine switch:
Device: 10.250.11.51 (spine switch)
Problem: SNMP agent bug on ifIndex 402718780
Effect: ifDescr GETBULK walk hangs for 180 seconds (initial attempt + 5 retries, 30s timeout each)
Result: ALL table metrics blocked — CPU, memory, PSU, temperature = ZERO DATA
Meanwhile, the two leaf switches (.59, .60) with no buggy interfaces
reported everything perfectly.
The fix? Separate SNMP groups. Each group polls independently. If the interface walk hangs, CPU/memory/PSU/temperature still get collected.
Group Architecture
snmp:
  # Group 1: Device health (CPU, memory, PSU, temp)
  - group: Device Default
    interval:
      minutes: 1
    dimensions: [...]
    metrics: [...]          # Scalar metrics (sysUpTime)
    subgroups:
      - subgroup: CPU and Memory
      - subgroup: Power Supply
      - subgroup: Temperature
  # Group 2: Interfaces (separate polling, fault isolated)
  - group: Interfaces
    interval:
      minutes: 1
    dimensions: [...]
    subgroups:
      - subgroup: Interface Status
      - subgroup: Interface Counters
Now if the interface walk hangs on a buggy device, Group 1 still polls successfully. You get CPU, memory, PSU, and temperature data even when interfaces are broken.
Feature Sets
Feature sets let users toggle groups of metrics on/off in the monitoring configuration UI.
subgroups:
  - subgroup: CPU and Memory
    featureSet: CPU and Memory   # ← User can enable/disable this
    table: true
    metrics:
      - key: my_device.cpu_usage
        value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8
        type: gauge
  - subgroup: Temperature
    featureSet: Temperature      # ← Separate toggle
    table: true
    metrics:
      - key: my_device.sensor_value
        value: oid:1.3.6.1.4.1.9.9.91.1.1.1.1.4
        type: gauge
In the monitoring configuration JSON:
{
  "featureSets": ["CPU and Memory", "Temperature"]
}
Or use "featureSets": ["all"] to enable everything.
Rules:
- Metrics NOT in any featureSet are always reported (default metrics)
- A metric inherits the featureSet of its subgroup, which inherits from its group
- Metric-level featureSet overrides subgroup-level, which overrides group-level
- Max 10 groups per extension, 10 subgroups per group
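The inheritance and override rules above can be sketched as a fragment like the following. It is a minimal illustration, not a complete extension: the feature set names "Health" and "Sensor Detail" are hypothetical, dimensions are omitted for brevity, and the placement of featureSet at group and metric level simply follows the rules listed above.

```yaml
snmp:
  - group: Device Default
    featureSet: Health                 # hypothetical group-level default
    interval:
      minutes: 1
    subgroups:
      - subgroup: CPU and Memory       # no featureSet here: inherits "Health"
        table: true
        metrics:
          - key: my_device.cpu_usage
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8
            type: gauge
      - subgroup: Temperature
        featureSet: Temperature        # overrides the group-level "Health"
        table: true
        metrics:
          - key: my_device.sensor_value
            value: oid:1.3.6.1.4.1.9.9.91.1.1.1.1.4
            type: gauge
            featureSet: Sensor Detail  # hypothetical; overrides "Temperature"
```

With this layout, enabling only "Health" in the monitoring configuration would collect CPU usage but not the sensor metric, since the most specific featureSet wins.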
Complete Multi-Group Example
Here's a production-quality extension structure with fault isolation:
name: custom:com.dynatrace.extension.my-network-device
version: 0.0.2
minDynatraceVersion: "1.318.0"
author:
  name: Student
metrics:
  - key: my_device.sysuptime
    metadata:
      displayName: System Uptime
      unit: Count
  - key: my_device.cpu_usage
    metadata:
      displayName: CPU Usage
      unit: Percent
  - key: my_device.memory_used
    metadata:
      displayName: Memory Used
      unit: Byte
  - key: my_device.psu.status
    metadata:
      displayName: PSU Status
      unit: Count
  - key: my_device.sensor_value
    metadata:
      displayName: Sensor Temperature
      unit: Celsius
  - key: my_device.if.speed
    metadata:
      displayName: Interface Speed
      unit: MegaBit
  - key: my_device.if.in.octets.count
    metadata:
      displayName: Bytes In
      unit: Byte
  - key: my_device.if.out.octets.count
    metadata:
      displayName: Bytes Out
      unit: Byte
snmp:
  # ─── GROUP 1: Device Health (fault-isolated from interfaces) ───
  - group: Device Default
    interval:
      minutes: 1
    dimensions:
      - key: device.address
        value: this:device.address
      - key: device.port
        value: this:device.port
      - key: sys.name
        value: oid:1.3.6.1.2.1.1.5.0
      - key: sys.description
        value: oid:1.3.6.1.2.1.1.1.0
      - key: device.type
        value: const:Network Device
    metrics:
      - key: my_device.sysuptime
        value: oid:1.3.6.1.2.1.1.3.0
        type: gauge
    subgroups:
      - subgroup: CPU and Memory
        featureSet: CPU and Memory
        table: true
        dimensions:
          - key: cpu.index
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.1
        metrics:
          - key: my_device.cpu_usage
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.8
            type: gauge
          - key: my_device.memory_used
            value: oid:1.3.6.1.4.1.9.9.109.1.1.1.1.12
            type: gauge
      - subgroup: Power Supply
        featureSet: Power Supply
        table: true
        dimensions:
          - key: psu.descr
            value: oid:1.3.6.1.2.1.47.1.1.1.1.2
        metrics:
          - key: my_device.psu.status
            value: oid:1.3.6.1.4.1.9.9.117.1.1.2.1.2
            type: gauge
      - subgroup: Temperature
        featureSet: Temperature
        table: true
        dimensions:
          - key: sensor.descr
            value: oid:1.3.6.1.2.1.47.1.1.1.1.2
        metrics:
          - key: my_device.sensor_value
            value: oid:1.3.6.1.4.1.9.9.91.1.1.1.1.4
            type: gauge
  # ─── GROUP 2: Interfaces (polls independently) ───
  - group: Interfaces
    interval:
      minutes: 1
    dimensions:
      - key: device.address
        value: this:device.address
      - key: sys.name
        value: oid:1.3.6.1.2.1.1.5.0
    subgroups:
      - subgroup: Interface Status
        featureSet: Interfaces
        table: true
        dimensions:
          - key: if.name
            value: oid:1.3.6.1.2.1.31.1.1.1.1
          - key: if.alias
            value: oid:1.3.6.1.2.1.31.1.1.1.18
        metrics:
          - key: my_device.if.speed
            value: oid:1.3.6.1.2.1.31.1.1.1.15
            type: gauge
          - key: my_device.if.in.octets.count
            value: oid:1.3.6.1.2.1.31.1.1.1.6
            type: count
          - key: my_device.if.out.octets.count
            value: oid:1.3.6.1.2.1.31.1.1.1.10
            type: count
Cross-Table Bug (DED018)
This is the most common and dangerous bug in SNMP extensions. It happens when you put OIDs from different SNMP tables in the same table: true subgroup.
# ✗ WRONG: mixing ifXTable OIDs with ipAdEntTable OIDs
- subgroup: Interface Info
  table: true
  dimensions:
    - key: if.name
      value: oid:1.3.6.1.2.1.31.1.1.1.1     # ifXTable (indexed by ifIndex)
  metrics:
    - key: my_device.if.speed
      value: oid:1.3.6.1.2.1.31.1.1.1.15    # ifXTable ✓
    - key: my_device.if.ipaddr
      value: oid:1.3.6.1.2.1.4.20.1.1       # ipAdEntTable ✗ DIFFERENT TABLE!
The problem: ifXTable is indexed by ifIndex (1, 2, 3...) but ipAdEntTable is indexed by IP address (10.0.0.1, 10.0.0.2...). GETBULK walks them together and the rows don't align.
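To make the misalignment concrete, here is a sketch of what the two walks might return, written as YAML data purely for illustration (it is not extension syntax). The OID prefixes come from the example above; the interface names and addresses are hypothetical.

```yaml
# Illustration only, not extension YAML. Instance values are hypothetical.
ifXTable_ifName:              # 1.3.6.1.2.1.31.1.1.1.1.<ifIndex>
  "1.3.6.1.2.1.31.1.1.1.1.1": eth1/1
  "1.3.6.1.2.1.31.1.1.1.1.2": eth1/2
ipAdEntTable_ipAdEntAddr:     # 1.3.6.1.2.1.4.20.1.1.<a.b.c.d>
  "1.3.6.1.2.1.4.20.1.1.10.0.0.1": 10.0.0.1
  "1.3.6.1.2.1.4.20.1.1.10.0.0.2": 10.0.0.2
# Row suffixes are "1" and "2" in one table, "10.0.0.1" and "10.0.0.2" in the other.
# No suffix is shared, so there is nothing to join an if.name row to an IP row on.
```

A table subgroup matches values by their instance suffix, so columns are only combinable when every OID in the subgroup ends in the same index.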
# ✓ CORRECT: separate subgroups for different tables
- subgroup: Interface Status
  table: true
  dimensions:
    - key: if.name
      value: oid:1.3.6.1.2.1.31.1.1.1.1
  metrics:
    - key: my_device.if.speed
      value: oid:1.3.6.1.2.1.31.1.1.1.15
- subgroup: IP Addresses
  table: true
  dimensions:
    - key: if.ipaddr
      value: oid:1.3.6.1.2.1.4.20.1.1
  metrics:
    - key: my_device.if.ipadentifindex
      value: oid:1.3.6.1.2.1.4.20.1.2
      type: gauge
Compatible table combinations (same index space, safe to mix):
- ifTable + ifXTable: both indexed by ifIndex
- entPhysicalTable + entSensorValueTable: both indexed by entPhysicalIndex
- entPhysicalTable + cefcFRUPowerStatusTable: both indexed by entPhysicalIndex
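As a sketch of a safe combination, the subgroup below mixes one ifTable column with two ifXTable columns. Both tables are indexed by ifIndex, so the rows line up. The metric key my_device.if.oper_status is hypothetical and would also need an entry in the top-level metrics section.

```yaml
- subgroup: Interface Status
  featureSet: Interfaces
  table: true
  dimensions:
    - key: if.name
      value: oid:1.3.6.1.2.1.31.1.1.1.1     # ifXTable: ifName
  metrics:
    - key: my_device.if.oper_status          # hypothetical metric key
      value: oid:1.3.6.1.2.1.2.2.1.8         # ifTable: ifOperStatus
      type: gauge
    - key: my_device.if.speed
      value: oid:1.3.6.1.2.1.31.1.1.1.15     # ifXTable: ifHighSpeed
      type: gauge
```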
What's Next
In Module 4, we'll dive deeper into advanced SNMP patterns — MIB bundling, $networkFormat for address translation, variables for user-configurable filtering, and debugging SNMP polling issues from ActiveGate logs.
🛠 Hands-On Exercise
Edit the YAML in the editor, then click "Check My Work" to validate.
Fault Isolation with Groups
This extension has the ACI fault-isolation bug: all metrics are in one group, so a hanging interface walk blocks CPU/memory data.
- Split into two groups: Device Default (CPU, memory, sysUpTime) and Interfaces (interface metrics)
- Add featureSet: labels to each subgroup so they can be toggled independently
- Make sure the Interfaces group has its own interval and dimensions
This is the exact fix we shipped in ACI v0.0.5.