
Grail Architecture Deep Dive

Tutorial

Grail: Where All Data Lives After Migration

After migration, all observability data lives in Grail, Dynatrace's unified data lakehouse. In Gen2, metrics, logs, and traces lived in separate stores. In Gen3, everything goes into Grail and is queryable with DQL (Dynatrace Query Language). Understanding Grail is essential before migrating dashboards and alerts.

[Figure: Grail data flow. Data Sources (OneAgent, OpenTelemetry, APIs for Events and Logs, Cloud Integrations) → Processing (OpenPipeline: Parse, Filter, Route, Extract, Transform) → Storage (Grail: Metrics, Logs, Traces, Events, Entities, BizEvents) → Consumers (Dashboards, Notebooks, Workflows, Alerting). DQL: query any data from any consumer.]

Buckets: How Data is Organized

Grail organizes data into buckets. Each bucket has its own retention policy and access controls.

Bucket             What's In It               Default Retention
─────────────────  ─────────────────────────  ──────────────────────────────
default_logs       Application & system logs  35 days (configurable 1d–10y)
default_metrics    All metric data            15 months (1-min granularity)
default_events     Davis events, problems     35 days
default_bizevents  Business events            35 days
default_spans      Distributed traces         10 days (configurable 10d–10y)

💡 You can create custom buckets with different retention. Example: a compliance_logs bucket with 5-year retention for audit logs, while regular logs keep 35 days.

OpenPipeline: The Processing Layer

Before data reaches Grail, it passes through OpenPipeline. This is where you:

  • Parse: extract structured fields from raw log lines
  • Filter: drop noisy or irrelevant data before storage
  • Route: send data to different buckets based on rules
  • Extract metrics: create metrics from log patterns
  • Transform: enrich data with additional fields

🛠 Try it: In your Dynatrace environment, go to Settings → Log Monitoring → Log processing (or search "OpenPipeline" in the app launcher). You'll see the default pipeline with its processing stages.
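
The five stages listed above can be pictured as a chain of small functions. The sketch below is a conceptual model in Python, not OpenPipeline configuration: the log format, the function names, and the routing rules (including the debug_logs bucket and the "audit" service name) are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical log format for illustration: "<LEVEL> <service> <message>"
LINE_PATTERN = re.compile(r"^(?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)$")


def parse(raw_line):
    """Parse: extract structured fields from a raw log line."""
    match = LINE_PATTERN.match(raw_line)
    return match.groupdict() if match else {"level": "UNKNOWN", "message": raw_line}


def keep(record):
    """Filter: drop noisy records (here, anything at TRACE level)."""
    return record["level"] != "TRACE"


def transform(record):
    """Transform: enrich the record with an additional field."""
    return {**record, "is_error": record["level"] == "ERROR"}


def route(record):
    """Route: choose a target bucket from record attributes."""
    if record.get("service") == "audit":
        return "compliance_logs"
    if record["level"] == "DEBUG":
        return "debug_logs"
    return "default_logs"


def run_pipeline(raw_lines):
    """Run every line through parse, filter, transform, metric extraction, and routing."""
    buckets = {}
    errors_by_service = Counter()  # Extract metrics: error count per service
    for raw_line in raw_lines:
        record = parse(raw_line)
        if not keep(record):
            continue
        record = transform(record)
        if record["is_error"]:
            errors_by_service[record.get("service", "unknown")] += 1
        buckets.setdefault(route(record), []).append(record)
    return buckets, errors_by_service


lines = [
    "ERROR checkout payment gateway timeout",
    "DEBUG checkout cart recalculated",
    "TRACE checkout entering handler",
    "INFO audit user admin changed role",
]
buckets, error_metric = run_pipeline(lines)
print({name: len(records) for name, records in buckets.items()})
print(dict(error_metric))
```

The TRACE line never reaches storage, the audit line lands in compliance_logs, and the error counter is derived from the log stream rather than ingested separately. In a real environment these rules are defined in OpenPipeline itself, not in application code.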

Retention: Configurable Per Bucket

Data Type           Default Retention  Configurable Range
──────────────────  ─────────────────  ────────────────────────────
Logs                35 days            1 day – 10 years (+1 week)
Spans (traces)      10 days            10 days – 10 years (+1 week)
Metrics             15 months          Fixed (1-min granularity)
Events              35 days            Configurable
Business events     35 days            Configurable
Security (builtin)  3 years            Fixed
Security (3rd pty)  1 year             Fixed

You configure retention per bucket. This means you can keep security logs for years while keeping debug logs for only a few days, which optimizes cost without losing compliance data.
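
To make the retention rules concrete, here is a minimal Python sketch that checks a planned retention against the configurable ranges in the table and tests whether a record is still inside its bucket's window. It is only a model: the function names are hypothetical, and reading "10 years (+1 week)" as 3650 + 7 days is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "10 years (+1 week)" is read as 3650 + 7 days.
MAX_DAYS = 3650 + 7

# Only the tables whose range is spelled out in the table above.
CONFIGURABLE_RANGE = {
    "logs": (1, MAX_DAYS),
    "spans": (10, MAX_DAYS),
}


def is_valid_retention(table, retention_days):
    """Return True if retention_days falls inside the listed range for table."""
    if table not in CONFIGURABLE_RANGE:
        raise ValueError(f"no configurable range listed for {table!r}")
    low, high = CONFIGURABLE_RANGE[table]
    return low <= retention_days <= high


def is_retained(record_time, retention_days, now):
    """Return True if a record is still inside its bucket's retention window."""
    return now - record_time <= timedelta(days=retention_days)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
week_old = now - timedelta(days=7)

print(is_valid_retention("logs", 7))    # True: logs can go as low as 1 day
print(is_valid_retention("spans", 7))   # False: spans start at 10 days
print(is_retained(week_old, 35, now))   # True: inside a 35-day bucket
print(is_retained(week_old, 3, now))    # False: aged out of a 3-day bucket
```

Running a check like this over your planned buckets before creating them catches mismatches early, such as a 7-day plan for spans.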

Access Control at the Data Level

Grail supports fine-grained access control:

Level   What It Controls               Example
──────  ─────────────────────────────  ───────────────────────────────────
Bucket  Who can query which bucket     Team A → only compliance_logs
Table   Who can query which data type  Devs → logs + traces, not billing
Record  Filter rows by attribute       Only see data where team == "yours"
Field   Mask or hide specific columns  Hide PII fields from non-admins

⚠️ This is a major upgrade from Gen2 management zones, which could only filter by entity. Grail permissions filter by data attributes, which is much more powerful.
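
The four levels in the table compose like successive filters. The Python sketch below models that idea only; it is not Dynatrace policy syntax, and the function name, record layout, and sample data are hypothetical.

```python
def apply_permissions(records, allowed_buckets, allowed_tables, record_filter, hidden_fields):
    """Return only the records, and only the fields, a user may see."""
    visible = []
    for record in records:
        if record["bucket"] not in allowed_buckets:   # bucket level
            continue
        if record["table"] not in allowed_tables:     # table level
            continue
        if not record_filter(record):                 # record level
            continue
        # Field level: strip hidden columns from whatever survives.
        visible.append({k: v for k, v in record.items() if k not in hidden_fields})
    return visible


records = [
    {"bucket": "default_logs", "table": "logs", "team": "payments",
     "email": "a@example.com", "content": "login ok"},
    {"bucket": "default_logs", "table": "logs", "team": "search",
     "email": "b@example.com", "content": "query slow"},
    {"bucket": "compliance_logs", "table": "logs", "team": "payments",
     "email": "c@example.com", "content": "role changed"},
    {"bucket": "default_bizevents", "table": "bizevents", "team": "payments",
     "email": "d@example.com", "content": "invoice paid"},
]

visible = apply_permissions(
    records,
    allowed_buckets={"default_logs", "default_bizevents"},
    allowed_tables={"logs"},
    record_filter=lambda r: r["team"] == "payments",
    hidden_fields={"email"},
)
print(visible)
```

Only the first record survives, and it comes back without its email field. An entity-based filter could not express the last two checks, since they depend on attributes of the data itself.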

Key Takeaways

  • Grail = one store for all observability data (metrics, logs, traces, events, entities)
  • OpenPipeline = processing layer between ingestion and storage
  • Buckets = data containers with per-bucket retention and access control
  • DQL = the single query language that works across all data in Grail

Under the Hood: Why Grail is Different

Schema-on-Read (No Upfront Schema)

Traditional data warehouses require you to define schemas before storing data. Grail uses schema-on-read: data is stored as-is, and you define the structure when you query it. This means:

  • No schema migrations when data formats change
  • No data loss from schema mismatches
  • Full flexibility to explore any data at any time
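
A small Python sketch can show what schema-on-read means in practice: records are stored exactly as they arrived, and field names are mapped only when a query runs. This is a conceptual illustration, not how Grail is implemented; the sample records and the read_with_schema helper are hypothetical.

```python
import json

# Raw records are stored exactly as they arrived; no schema was declared up front.
stored = [
    '{"level": "ERROR", "msg": "timeout", "duration_ms": 5200}',
    '{"level": "INFO", "msg": "ok"}',
    '{"severity": "ERROR", "message": "disk full", "host": "web-1"}',  # newer format
]


def read_with_schema(raw_records, schema):
    """Apply a schema at query time.

    schema maps each output field to a list of candidate source keys;
    the first key present in a record wins, otherwise the field is None.
    """
    for raw in raw_records:
        data = json.loads(raw)
        yield {
            field: next((data[key] for key in candidates if key in data), None)
            for field, candidates in schema.items()
        }


schema = {"level": ["level", "severity"], "text": ["msg", "message"]}
errors = [row for row in read_with_schema(stored, schema) if row["level"] == "ERROR"]
print(errors)
```

The third record uses different field names, yet nothing had to be migrated or rejected at write time. The query-time schema absorbs the format change, and a different query is free to read the same records with a different schema.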

Datawarping (Patented Technology)

Datawarping is Dynatrace's patented storage and retrieval technology. Instead of building indexes to find data (indexes that, by the figure given here, add 90-99% overhead), Datawarping inverts the problem: it identifies where data is not stored, an approach reported to be 250x faster than traditional indexing. This eliminates:

  • Manual index management
  • Hot/cold storage tiers and rehydration
  • The trade-off between query speed and storage cost

Massively Parallel Processing (MPP)

Every DQL query is automatically split across thousands of parallel nodes. Each node processes its data segment independently, then results are merged. This means:

  • Query performance scales linearly with data volume
  • No single point of failure (fault-tolerant by design)
  • Data never leaves the environment scope (security isolation)
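
The split, process, and merge pattern described above can be sketched in a few lines of Python. This is a toy model of the scatter-gather idea, not Grail's execution engine: the function names are hypothetical, four threads stand in for the parallel nodes, and Python threads will not show a real speedup for CPU-bound work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, segment_count):
    """Split records into roughly equal segments, one per worker."""
    return [records[i::segment_count] for i in range(segment_count)]


def count_by_level(segment):
    """Each worker aggregates only its own segment."""
    return Counter(record["level"] for record in segment)


def parallel_count(records, segment_count=4):
    """Scatter segments to workers, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=segment_count) as pool:
        partials = list(pool.map(count_by_level, split(records, segment_count)))
    return sum(partials, Counter())


records = [{"level": "ERROR"}] * 3 + [{"level": "INFO"}] * 7
merged = parallel_count(records)
print(merged)
```

Because each partial count depends only on its own segment, the merge step is a simple sum, and adding workers does not change the result.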

💡 Grail combines the cost efficiency of a data lake with the query performance of a data warehouse, without indexes, without manual tiering, and without schema management. That's why it's called a "data lakehouse".

Built-in Security Buckets

Bucket                          What's In It                         Retention
──────────────────────────────  ───────────────────────────────────  ─────────
default_securityevents_builtin  Dynatrace-generated security events  3 years
default_securityevents          Third-party security events          1 year

💡 Discover all buckets: fetch dt.system.buckets | fields name, table, retentionDays

Custom Buckets

Create custom buckets for different retention or compliance needs:

  • Compliance bucket: 7-year retention for audit logs
  • Short-term bucket: 7-day retention for debug logs (saves cost)
  • Team bucket: separate storage per team, secured with Grail permissions

Route data to custom buckets using OpenPipeline, Dynatrace's data processing engine that handles ingestion, transformation, and routing at scale.

🔧 Migration Step: Plan Your Data Retention

Before migrating, map your current retention to Grail buckets:

Step  Action                                Command / UI
────  ────────────────────────────────────  ────────────────────────────────────────────────────
1     Check current retention settings      Settings → Data privacy → Retention
2     Identify compliance requirements      Which data needs 1yr+ retention?
3     Plan custom buckets                   compliance_logs (5yr), debug_logs (7d)
4     Create buckets in Grail               Settings → Grail → Buckets → + Bucket
5     Set up OpenPipeline routing           Settings → Log Monitoring → OpenPipeline
6     Verify data flows to correct buckets  fetch dt.system.buckets | fields name, retentionDays

💡 Cost optimization: In Gen2, all logs had the same retention. In Gen3, you can route debug logs to a 7-day bucket and audit logs to a 5-year bucket. Depending on how much of your volume is short-lived debug data, this alone can reduce storage costs by 30-50%.
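
To see where such a saving comes from, the arithmetic can be worked through with retained volume as a rough proxy for storage cost. Every number below is a made-up input, not measured data, and real pricing may not scale with retained volume alone. The result depends entirely on your own log mix.

```python
def retained_gb(daily_gb, retention_days):
    """Steady-state volume held in storage: daily ingest times days kept."""
    return daily_gb * retention_days


# Hypothetical daily ingest per log category, in GB/day (illustrative only).
daily_ingest = {"debug": 600, "application": 398, "audit": 2}

# Gen2-style: one retention (35 days) for every log category.
uniform = sum(retained_gb(gb, 35) for gb in daily_ingest.values())

# Gen3-style: per-bucket retention (5 years taken as 5 * 365 days).
retention_days = {"debug": 7, "application": 35, "audit": 5 * 365}
tiered = sum(retained_gb(gb, retention_days[name]) for name, gb in daily_ingest.items())

reduction_pct = 100 * (uniform - tiered) / uniform
print(f"uniform: {uniform} GB, tiered: {tiered} GB, reduction: {reduction_pct:.1f}%")
```

The long audit retention adds volume (2 GB/day kept for 1825 days is 3650 GB), so the net saving comes from shortening the high-volume debug stream. If audit data made up a larger share of ingest, the tiered plan could retain more than the uniform one, which is why it is worth running this calculation with your own figures before committing to a bucket layout.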