Homeโ€บ๐Ÿงช DQL Recipesโ€บModule 82 min read ยท 9/10

Advanced Functions

Reference

Advanced Functions

Beyond the basics (count, avg, sum, filter), DQL has specialized function categories for network analysis, data masking, statistics, and AI/embedding search.

Network Functions

// Check if IPs are private or public
fetch logs, from:now()-1h
| filter isNotNull(client.ip)
| fieldsAdd is_private = ipIsPrivate(client.ip),
           is_public = ipIsPublic(client.ip)
| summarize private=countIf(is_private), public=countIf(is_public)
// Filter by subnet
fetch logs, from:now()-1h
| filter ipIn(client.ip, "10.0.0.0/8")
| summarize count=count(), by:{client.ip}
// Anonymize IPs (mask last octet)
fetch logs, from:now()-1h
| fieldsAdd masked_ip = ipMask(client.ip, 24)
| fields timestamp, masked_ip, content
Function          What It Does
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
ipIsPrivate(ip)   true for RFC1918 addresses (10.x, 172.16-31.x, 192.168.x)
ipIsPublic(ip)    true for public addresses
ipIsLoopback(ip)  true for 127.0.0.0/8
ipMask(ip, bits)  mask to /bits (anonymize)
ipIn(ip, cidr)    check if IP is in subnet
isIp(field)       type check

Hash Functions

// Mask PII with irreversible hash
fetch logs, from:now()-1h
| filter contains(content, "@")
| fieldsAdd masked_email = hashSha256(content)
| fields timestamp, masked_email
Function          Use Case
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
hashSha256(str)   Irreversible masking (PII compliance)
hashMd5(str)      Fast hash (not cryptographic)
hashCrc32(str)    Fast checksum
hashXxHash64(str) Fastest (good for dedup keys)

Statistical Functions

// Multiple percentiles in one call
fetch spans, from:now()-1h
| filter span.kind == "SERVER"
| summarize p50=percentile(duration, 50),
    p90=percentile(duration, 90),
    p99=percentile(duration, 99),
    std=stddev(duration),
    med=median(duration)
// Correlation between CPU and response time
fetch spans, from:now()-2h
| filter isNotNull(process.cpu.usage) AND isNotNull(duration)
| summarize corr = correlation(process.cpu.usage, duration)
Function                    What It Calculates
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
correlation(a, b)           Pearson correlation coefficient
stddev(field)               Standard deviation
variance(field)             Variance
median(field)               Median value
percentiles(field, 50,90,99) Multiple percentiles at once
countDistinctApprox(field)  Fast approximate cardinality
countDistinctExact(field)   Exact cardinality (up to 1M)

Named Parameters (Critical Gotcha)

// WRONG โ€” positional params cause errors
fieldsAdd short = substring(content, 0, 100)
fieldsAdd result = if(x > 10, "high", "low")

// RIGHT โ€” use named params
fieldsAdd short = substring(content, from:0, to:100)
fieldsAdd result = if(x > 10, then:"high", else:"low")

Field Escaping

// Backticks for fields with hyphens or special characters
filter `error-code` == "404"
fieldsAdd `response-time-ms` = duration / 1000000

Multi-Value Filtering

// array() function for in() filter
filter in(loglevel, array("ERROR", "WARN", "SEVERE"))
โ–ถ Knowledge Check

Q: How do you anonymize an IP address to /24 in DQL?

  • โŒ mask(ip, 24)
  • โœ… ipMask(ip, 24)
  • โŒ ip.mask(24)

Q: What's wrong with substring(content, 0, 100)?

  • โœ… Must use named params: substring(content, from:0, to:100)
  • โŒ substring doesn't exist in DQL
  • โŒ The indices are wrong