Grail Buckets & Data Partitioning
In Gen2, all data had the same retention. In Gen3, Grail buckets let you store different data for different durations, route data to different storage locations, and control query costs. This is the "data partitioning" leg of the MZ replacement triangle.
What Are Buckets?
Gen2                                    Gen3
──────────────────────────────────────  ──────────────────────────────────────
All logs: same retention (35 days)      Each bucket: independent retention
All data in one "pool"                  Route data to specific buckets
No cost control per team                Query costs trackable per bucket
No compliance separation                Compliance data in dedicated bucket
A bucket is a logical storage unit with:
- Independent retention: 1 day to 10 years
- Independent access: ABAC policies can scope to specific buckets
- Independent routing: OpenPipeline routes data to buckets based on conditions
- Query cost tracking: know which bucket costs how much to query
Default Buckets
Bucket             Table      Default Retention  Notes
─────────────────  ─────────  ─────────────────  ───────────────────────────
default_logs       logs       35 days            All logs go here by default
default_events     events     35 days            Davis events, custom events
default_bizevents  bizevents  35 days            Business events
default_spans      spans      10 days            Distributed traces
dt_system_events   system     1 year             Audit, billing (fixed)
Designing Your Bucket Strategy
Bucket Name          Retention  Purpose                    Who Queries It
───────────────────  ─────────  ─────────────────────────  ──────────────────────────
hot_troubleshooting  7 days     Active incident response   Ops teams (high frequency)
standard_operations  35 days    Day-to-day monitoring      All teams
analytics_reporting  90 days    Monthly/quarterly reports  Managers, BI
security_compliance  365 days   Audit trail, compliance    Security team, auditors
debug_verbose        3 days     Debug logs (high volume)   Developers (rarely)
Design Rules
- Keep frequently-queried buckets around 2-3 TB daily retained volume
- Use default_logs as a playground; don't put production workloads there
- Route high-volume debug logs to a short-retention bucket or to "no storage" (drop)
- Compliance data in dedicated bucket with long retention + restricted access
- Max ~80 buckets per environment (supports ~5 TB/day per table)
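The limits in these rules can be checked mechanically before any bucket is created. The following is a minimal sketch in Python, not a Dynatrace API: the `BucketPlan` class and `validate_plan` function are hypothetical names, and "10 years" is approximated as 3650 days.

```python
from dataclasses import dataclass

MAX_BUCKETS = 80            # approximate ceiling quoted in the design rules
MIN_RETENTION_DAYS = 1      # shortest retention mentioned in this section
MAX_RETENTION_DAYS = 3650   # "10 years", approximated as 10 * 365 days


@dataclass(frozen=True)
class BucketPlan:
    """One planned bucket (hypothetical helper type, not a Dynatrace object)."""
    name: str
    retention_days: int
    purpose: str


def validate_plan(buckets):
    """Return a list of problems; an empty list means the plan passes."""
    problems = []
    if len(buckets) > MAX_BUCKETS:
        problems.append(f"{len(buckets)} buckets exceeds the ~{MAX_BUCKETS} guideline")
    seen = set()
    for bucket in buckets:
        if bucket.name in seen:
            problems.append(f"duplicate bucket name: {bucket.name}")
        seen.add(bucket.name)
        if not MIN_RETENTION_DAYS <= bucket.retention_days <= MAX_RETENTION_DAYS:
            problems.append(
                f"{bucket.name}: retention of {bucket.retention_days} days is out of range"
            )
    return problems


plan = [
    BucketPlan("hot_troubleshooting", 7, "Active incident response"),
    BucketPlan("standard_operations", 35, "Day-to-day monitoring"),
    BucketPlan("analytics_reporting", 90, "Monthly/quarterly reports"),
    BucketPlan("security_compliance", 365, "Audit trail, compliance"),
    BucketPlan("debug_verbose", 3, "Debug logs (high volume)"),
]

print(validate_plan(plan))  # an empty list means no rule was violated
```

Running a check like this in CI against a bucket inventory file is one way to catch an out-of-range retention or a duplicate name before it reaches the environment.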
Routing Data to Buckets (OpenPipeline)
OpenPipeline routing rules decide which bucket receives each record:
// Route by severity
Route 1: loglevel == "ERROR" OR loglevel == "WARN"  → hot_troubleshooting (7d)
Route 2: loglevel == "INFO"                         → standard_operations (35d)
Route 3: loglevel == "DEBUG"                        → debug_verbose (3d) OR no_storage (drop)

// Route by team (using K8s namespace)
Route 1: k8s.namespace.name == "payments"           → payments_bucket (35d)
Route 2: k8s.namespace.name == "platform"           → platform_bucket (35d)
Default:                                            → default_logs (35d)

// Route by compliance requirement
Route 1: matchesPhrase(content, "audit")            → security_compliance (365d)
Route 2: matchesPhrase(content, "transaction")      → compliance_bucket (365d)
Default:                                            → standard_operations (35d)
⚠️ Routing is evaluated BEFORE processing. Fields added during processing (such as enriched attributes) CANNOT be used in routing conditions. Plan your routing around fields that exist at ingest time.
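The ordering constraint is easier to see in a small simulation. The sketch below is plain Python, not OpenPipeline itself. It assumes routes are tried top to bottom and the first match wins, and the `route` and `enrich` functions and the `team` attribute are hypothetical stand-ins.

```python
def route(record, routes, default_bucket):
    """Return the bucket of the first route whose condition matches the record."""
    for condition, bucket in routes:
        if condition(record):
            return bucket
    return default_bucket


# Severity routes mirroring the listing in this section.
SEVERITY_ROUTES = [
    (lambda r: r.get("loglevel") in ("ERROR", "WARN"), "hot_troubleshooting"),
    (lambda r: r.get("loglevel") == "INFO", "standard_operations"),
    (lambda r: r.get("loglevel") == "DEBUG", "debug_verbose"),
]

# Hypothetical route on a field that only exists after processing.
ENRICHED_ROUTES = [
    (lambda r: r.get("team") == "payments", "payments_bucket"),
]


def enrich(record):
    """Stand-in for a processing step that adds a 'team' attribute."""
    enriched = dict(record)
    if record.get("k8s.namespace.name") == "payments":
        enriched["team"] = "payments"
    return enriched


raw = {"loglevel": "ERROR", "k8s.namespace.name": "payments"}

# Routing sees only the raw record, so a condition on 'team' never matches.
bucket_at_ingest = route(raw, ENRICHED_ROUTES, "default_logs")
bucket_by_severity = route(raw, SEVERITY_ROUTES, "default_logs")

print(bucket_at_ingest)    # default_logs
print(bucket_by_severity)  # hot_troubleshooting
```

The same record would reach `payments_bucket` only if the route condition used `k8s.namespace.name`, which is present at ingest time.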
Terraform: Create Buckets
# Note: bucket creation is via Settings API or UI
# Terraform resource for OpenPipeline routing:
resource "dynatrace_openpipeline_v2_logs_pipelines" "routing" {
  name    = "Team-Based Log Routing"
  enabled = true

  routing {
    match_condition = "k8s.namespace.name == \"payments\""
    pipeline_id     = "payments_pipeline"
  }

  pipeline {
    id      = "payments_pipeline"
    enabled = true

    storage_stage {
      bucket_assignment = "payments_logs"
    }
  }
}
Querying Specific Buckets
// Query all logs (across all buckets)
fetch logs, from:now()-1h
// Query specific bucket only (faster, cheaper)
fetch logs, from:now()-1h
| filter dt.system.bucket == "hot_troubleshooting"
// See which buckets have data
fetch dt.system.buckets
| filter table == "logs"
| fields bucket_name, record_count, size_bytes
Bucket Access Control
Combine buckets with ABAC for team-level data isolation:
// Policy: team can only query their bucket
ALLOW storage:logs:read
  WHERE storage:bucket == "payments_logs"

// Or combine with security context:
ALLOW storage:logs:read
  WHERE storage:dt.security_context MATCH ("SV-PAYMENTS")
  AND storage:bucket == "payments_logs"
Cost Optimization with Buckets
Strategy                                 Savings
───────────────────────────────────────  ────────────────────────────────────────
Route debug logs to 3-day bucket         ~90% less retention cost for debug data
Drop verbose health checks               100% savings (no_storage)
Short retention for hot troubleshooting  ~80% less than 35-day default
Dedicated compliance bucket              Only pay for long retention where needed
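The retention percentages can be sanity-checked with simple arithmetic. This sketch assumes retention cost scales linearly with the number of days kept at a fixed daily ingest volume, which is a simplification rather than a statement of the actual rate card. `retention_saving` is a hypothetical helper.

```python
def retention_saving(new_days, baseline_days=35):
    """Fraction of retention cost saved versus the baseline, assuming linear cost."""
    if new_days < 0 or baseline_days <= 0:
        raise ValueError("retention periods must not be negative")
    return 1 - new_days / baseline_days


for name, days in [("debug_verbose", 3), ("hot_troubleshooting", 7), ("no_storage", 0)]:
    print(f"{name}: {retention_saving(days):.0%} less than a 35-day bucket")
```

Under that linear assumption a 7-day bucket saves 80% against the 35-day default, a 3-day bucket saves about 91%, and dropped data saves 100%.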
"Retain with Included Queries" Model
Buckets have two retention tiers:
- Included Queries retention (10-35 days): included query volume = retained_GiB × 15/day
- Overall retention (up to 10 years): usage-based query billing beyond included
Architect predictable dashboard costs for recent data; push older retention to usage-based.
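A rough sizing sketch for the two tiers follows. It takes the formula in this section at face value (retained GiB multiplied by 15 per day) and assumes a 35-day included window. Both function names are hypothetical, and the actual billing rules should be confirmed against the current Dynatrace rate card.

```python
def included_query_gib_per_day(retained_gib, factor=15.0):
    """Included daily query volume, using the retained_GiB x 15/day formula."""
    return retained_gib * factor


def query_tier(data_age_days, included_window_days=35):
    """Classify a query by the age of the data it reads (assumed 35-day window)."""
    if data_age_days <= included_window_days:
        return "included"
    return "usage-based"


retained_gib = 200  # example figure, not taken from the source
print(included_query_gib_per_day(retained_gib))  # 3000.0
print(query_tier(10))                            # included
print(query_tier(90))                            # usage-based
```

A dashboard that reads only the last 7 days stays in the included tier under these assumptions, while a quarterly report reaching back 90 days falls into usage-based billing.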
Migration: From "One Pool" to Buckets
Step  Action                                 Result
────  ─────────────────────────────────────  ──────────────────────────────────
1     Analyze current log volume by source   Know what you're routing
2     Identify retention requirements        Compliance vs operational vs debug
3     Create buckets (UI or API)             Storage containers ready
4     Configure OpenPipeline routing rules   Data flows to correct buckets
5     Verify with DQL                        Check dt.system.bucket field
6     Update ABAC policies if needed         Scope access to specific buckets
7     Monitor costs per bucket               Validate savings
Knowledge Check
Q: Can you change a bucket's retention after creation?
A: Yes, but shortening retention triggers deletion of data older than the new period, and that deletion can take days to complete. Lengthening retention takes effect immediately.
Q: If you query fetch logs without a bucket filter, what happens?
A: It queries ALL log buckets. This is by design: tables abstract across buckets. Add | filter dt.system.bucket == "my_bucket" to scope to a specific bucket (faster and cheaper).
Q: A customer wants 365-day retention for audit logs but 7-day for debug logs. How?
A: Create two buckets: security_compliance (365d) and debug_verbose (7d). Configure OpenPipeline routing: match audit-related logs → compliance bucket, match debug logs → debug bucket. Default → standard 35-day bucket.