Kubernetes Issues
Kubernetes problems show up as pod crashes, OOM kills, scheduling failures, and resource exhaustion. Here's how to investigate each with DQL.
Pod Status Overview
// Workloads (cloud application entities) per cluster and namespace; this query does not return pod phase
fetch dt.entity.cloud_application
| fields entity.name, id, k8s.cluster.name, k8s.namespace.name
| sort k8s.cluster.name asc
K8s Events (Warnings)
// Recent K8s warning events
fetch events, from:now()-2h
| filter event.kind == "K8S_EVENT" AND event.type == "Warning"
| fields timestamp, event.reason, event.message, k8s.namespace.name, k8s.pod.name
| sort timestamp desc
| limit 20
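To see which problems dominate before reading individual events, the same filter can be aggregated by reason. This is a minimal sketch that reuses only the fields from the query above; the alias event_count is an illustrative name, not a built-in field.

```
// Count warning events per reason and namespace over the last 2 hours
fetch events, from:now()-2h
| filter event.kind == "K8S_EVENT" AND event.type == "Warning"
| summarize event_count = count(), by:{event.reason, k8s.namespace.name}
| sort event_count desc
| limit 20
```

A reason that dominates a single namespace usually points to one misbehaving workload, while the same reason spread across namespaces suggests a node-level or cluster-level cause.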
Container Resource Usage
// CPU usage vs limits
timeseries cpu=avg(dt.kubernetes.container.cpu_usage),
    limits=avg(dt.kubernetes.container.limits_cpu),
    by:{k8s.container.name, k8s.namespace.name}, from:now()-1h

// Memory usage vs limits (OOM detection)
timeseries mem=avg(dt.kubernetes.container.memory_working_set),
    limits=avg(dt.kubernetes.container.limits_memory),
    by:{k8s.container.name, k8s.namespace.name}, from:now()-1h
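Comparing two series by eye gets tedious with many containers. One way to rank them is to turn usage and limit into a percentage and keep only the containers that come close to the limit. The sketch below assumes DQL iterative expressions (the `[]` suffix) apply element-wise to both series; usage_pct, peak_pct, and the threshold of 90 are illustrative choices, not values from this guide.

```
// Memory working set as a percentage of the memory limit, per container
timeseries mem=avg(dt.kubernetes.container.memory_working_set),
    limits=avg(dt.kubernetes.container.limits_memory),
    by:{k8s.container.name, k8s.namespace.name}, from:now()-1h
| fieldsAdd usage_pct = 100 * mem[] / limits[]
| fieldsAdd peak_pct = arrayMax(usage_pct)
// 90 is an arbitrary example threshold; tune it to your environment
| filter peak_pct > 90
| sort peak_pct desc
```

Containers that have no memory limit set produce no usable percentage and drop out of the result, so an empty result does not prove that every container is safe.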
Key K8s Metrics
Metric                                      What It Measures
──────────────────────────────────────────  ────────────────────────
dt.kubernetes.container.cpu_usage           Container CPU usage
dt.kubernetes.container.memory_working_set  Container memory usage
dt.kubernetes.container.requests_cpu        CPU requests
dt.kubernetes.container.limits_cpu          CPU limits
dt.kubernetes.container.requests_memory     Memory requests
dt.kubernetes.container.limits_memory       Memory limits
dt.kubernetes.workload.conditions           Pod conditions
dt.kubernetes.pods                          Pod status
Decision Tree
Pod CrashLoopBackOff? → Check container logs for startup errors
  ↓ No
OOMKilled? → Memory limit too low, increase limits_memory
  ↓ No
ImagePullBackOff? → Check image name, registry auth, network
  ↓ No
Pending (unschedulable)? → Check node resources, taints, affinity rules
  ↓ No
Running but unhealthy? → Check readiness/liveness probes, service mesh
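For the first branch of the tree, the container logs can be pulled with DQL as well. The following is a minimal sketch, assuming log ingestion is enabled for the cluster and that records carry the usual loglevel and content fields; the pod name "checkout-7d9f" is a hypothetical placeholder.

```
// Most recent error logs for one crashing pod
fetch logs, from:now()-2h
| filter k8s.pod.name == "checkout-7d9f"   // placeholder pod name
| filter loglevel == "ERROR"
| fields timestamp, k8s.container.name, content
| sort timestamp desc
| limit 50
```

If the container dies before it logs anything, this returns nothing; in that case the warning events for the same pod are usually the better starting point.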
Metadata Enrichment
Dynatrace Operator enriches ALL telemetry with K8s metadata. Use these fields for filtering:
- k8s.cluster.name, k8s.namespace.name, k8s.pod.name
- k8s.container.name, k8s.workload.name, k8s.workload.kind
- dt.security_context: for ABAC boundaries based on K8s labels
- dt.cost.costcenter: for cost allocation from K8s annotations
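Because these fields are attached to the records themselves, a query can narrow any telemetry type to one workload without joining against entities. A sketch under assumptions: the cluster, namespace, and workload values are hypothetical placeholders, and log_count is an illustrative alias.

```
// Log volume per pod and log level for a single workload
fetch logs, from:now()-1h
| filter k8s.cluster.name == "prod-cluster"
    AND k8s.namespace.name == "payments"
    AND k8s.workload.name == "checkout"
| summarize log_count = count(), by:{k8s.pod.name, loglevel}
| sort log_count desc
```

The same three filter conditions should carry over to other record types, such as the events query earlier in this section, provided the enrichment is active for that data.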
Knowledge Check
Q: A container is OOMKilled. Which metric should you check?
- ❌ dt.kubernetes.container.cpu_usage
- ✅ dt.kubernetes.container.memory_working_set vs limits_memory
- ❌ dt.kubernetes.pods