
docs: add PPL language reference with data-grounded examples#142

Closed
anirudha wants to merge 6 commits into opensearch-project:main from anirudha:main


@anirudha
Collaborator

Summary

Add a comprehensive PPL (Piped Processing Language) documentation section to the Observability Stack docs, aimed at Splunk SREs who are evaluating PPL as a query language for OpenSearch observability.

  • 27 detailed per-command reference pages with consistent structure
  • All examples use real OTel data from logs-otel-v1* and otel-v1-apm-span-* indices - no fabricated data
  • Every example verified against local OpenSearch PPL API and includes a playground link
  • PPL overview page positioning it as the native query language (with KQL/EQL comparison)
  • Function reference covering 200+ built-in functions across 13 categories
  • Masterclass pipeline examples showcasing PPL's full power for SRE workflows

New pages

| Page | Description |
| --- | --- |
| ppl/index.md | PPL overview - why PPL, comparison table, getting started |
| ppl/commands.md | Command reference summary (50+ commands) |
| ppl/commands/*.md | 27 individual command pages with full detail |
| ppl/functions.md | Function reference (aggregation, string, datetime, math, etc.) |
| ppl/examples.md | Real-world OTel queries with playground links |

Per-command pages

Search & Filter: search, where
Fields & Transformation: fields, eval, rename, fillnull, expand, flatten
Aggregation & Statistics: stats, eventstats, streamstats, timechart, trendline
Sorting & Limiting: sort, head, dedup, top, rare
Text Extraction: parse, grok, rex, patterns, spath
Data Combination: join, lookup
Machine Learning: ml
Metadata: describe

Each command page follows a consistent structure:

  1. Description - what it does, when to use it
  2. Syntax - full syntax block
  3. Arguments - required/optional table with defaults
  4. Usage notes - behavioral notes, gotchas, performance tips
  5. Basic examples (3-5) - with playground links
  6. Extended examples (1-2) - OTel observability scenarios
  7. See also - cross-references to related commands

Examples page highlights

  • SRE incident response: error rate over time, first error per service, P95 latency timeseries
  • Trace analysis: slowest traces, error spans, latency percentiles, trace fan-out
  • AI agent observability: token usage, cost analysis, tool execution, agent invocation latency
  • Advanced analytics: eventstats outlier detection, streamstats rolling windows, trendline smoothing
  • Masterclass pipelines: service health scorecard, GenAI cost/perf analysis, Envoy access log parsing, ML-based error pattern discovery, cross-signal log-trace correlation
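
For readers unfamiliar with the rolling-window and outlier ideas named in the advanced analytics item, here is a minimal Python sketch of the underlying arithmetic. It is not PPL and does not reproduce the behavior of `streamstats`, `trendline`, or `eventstats`. The function names, the sample latencies, and the threshold `k` are all hypothetical, chosen only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def simple_moving_average(values, window):
    """Average of the last `window` values; None until the window is full."""
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)  # old values fall off automatically
    smoothed = []
    for value in values:
        buf.append(value)
        smoothed.append(sum(buf) / window if len(buf) == window else None)
    return smoothed


def flag_outliers(values, k):
    """Mark values more than k population standard deviations above the mean.

    This mirrors the general idea of comparing each row to a group-wide
    statistic; the threshold rule itself is an assumption.
    """
    threshold = mean(values) + k * pstdev(values)
    return [value > threshold for value in values]


# Hypothetical latency samples in milliseconds.
latencies = [100, 110, 90, 105, 900]
print(simple_moving_average(latencies, 2))
print(flag_outliers(latencies, 1.5))
```

The moving average returns `None` for the first `window - 1` positions rather than a partial average, which is one common convention; a real query engine may choose differently.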

Other changes

  • Sidebar reordered: Overview → Get Started → Send Data → PPL → Discover → ...
  • Updated main docs index and investigate page with PPL links
  • README updated with PPL section

Data grounding

All text extraction examples (grok, rex, parse, spath) were tested against actual log bodies in the cluster:

  • Envoy access logs from frontend-proxy: [timestamp] "METHOD /path HTTP/1.1" status ...
  • Kafka broker logs: [ComponentName id=N] message ...
  • Load generator logs: User action product: ID

Key PPL behavioral findings documented:

  • parse requires full-string match (implicitly anchored); rex does partial matching
  • Java regex named capture groups accept only ASCII letters and digits and must start with a letter, so underscores are rejected (use camelCase names)
  • Grok patterns with multiple unnamed %{DATA} cause "Duplicate key" errors
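
The first two findings can be illustrated outside PPL. The sketch below uses Python's `re` module as an analogy only: `re.fullmatch` stands in for the implicitly anchored matching described for `parse`, and `re.search` stands in for the partial matching described for `rex`. The sample log line, the `productId` group name, and the helper names are hypothetical. The name check encodes Java's documented rule for named groups (a letter followed by letters or digits); Python's own `re` would accept underscores.

```python
import re

# Hypothetical line shaped like the load generator logs listed above.
LINE = "User viewing product: ABC123"
PATTERN = r"product: (?P<productId>\w+)"

# Java's rule for named capture groups: a letter, then letters or digits.
JAVA_GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def full_match(pattern, text):
    """Anchored matching: the pattern must cover the whole string."""
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else None


def partial_match(pattern, text):
    """Partial matching: the pattern may match anywhere in the string."""
    match = re.search(pattern, text)
    return match.groupdict() if match else None


def is_valid_java_group_name(name):
    """True if `name` would be accepted as a Java named-group name."""
    return JAVA_GROUP_NAME.fullmatch(name) is not None


print(full_match(PATTERN, LINE))          # None: pattern does not cover the prefix
print(partial_match(PATTERN, LINE))       # {'productId': 'ABC123'}
print(full_match(r".*" + PATTERN, LINE))  # {'productId': 'ABC123'}
print(is_valid_java_group_name("product_id"))  # False
```

Under anchored matching, the usual workaround is to add a leading `.*` (and a trailing one if needed) so the pattern spans the whole line, as the third call shows.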

Test plan

  • npm run build passes with all internal links validated (starlight-links-validator)
  • All playground URLs use correct RISON encoding (!%27 for single quotes)
  • grok/rex/parse/spath patterns verified against real OTel data via local PPL API
  • No fabricated data (my-index, accounts, Apache CLF) remains in any example
  • All See Also links point to correct specific command pages
  • Visual review of each page in browser
  • Verify playground links open correctly with pre-filled queries
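
The RISON check in the test plan comes down to one escaping rule: in Rison, strings are delimited by single quotes and `!` is the escape character, so a literal `'` inside a string must be written `!'` (which becomes `!%27` once URL-encoded). The Python sketch below shows that rule in isolation. The function names and the sample query (including the `severityText` field) are hypothetical, and the playground's full URL layout is not reproduced here.

```python
from urllib.parse import quote


def escape_rison_string_body(text):
    """Escape the characters that are special inside a Rison quoted string.

    `!` is doubled first so that the escapes added for quotes are not
    themselves re-escaped.
    """
    return text.replace("!", "!!").replace("'", "!'")


def encode_query_for_url(query):
    """Rison-escape a query, then percent-encode it, leaving `!` literal."""
    return quote(escape_rison_string_body(query), safe="!")


# Hypothetical PPL fragment containing a string literal.
query = "where severityText = 'ERROR'"
print(encode_query_for_url(query))
# where%20severityText%20%3D%20!%27ERROR!%27
```

Without the `!` prefix, the first `%27` inside the query would be read as the end of the Rison string, which matches the premature-termination symptom this PR fixes.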

🤖 Generated with Claude Code

anirudha and others added 6 commits March 27, 2026 07:55
Add a full PPL (Piped Processing Language) documentation section to the
Observability Stack docs, positioning PPL as the native query language
for logs and traces.

New pages:
- PPL overview with comparison to KQL and EQL
- Command reference summary (50+ commands)
- 27 detailed per-command reference pages with Description, Syntax,
  Arguments, Usage notes, Basic/Extended examples, and See also
- Function reference (200+ functions across 13 categories)
- Observability examples with live playground links for OTel data

Commands documented individually: search, where, fields, eval, rename,
fillnull, expand, flatten, stats, eventstats, streamstats, timechart,
trendline, sort, head, dedup, top, rare, parse, grok, rex, patterns,
spath, join, lookup, ml, describe

Updated:
- Sidebar with categorized PPL command navigation
- Main docs index with PPL section and LinkCard
- Investigate page with links to new PPL reference
- README with PPL section and example query

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace every generic example (accounts, gender, age, etc.) across all
27 PPL command pages with real observability data from logs-otel-v1* and
otel-v1-apm-span-* indices. All queries validated against the local
OpenSearch PPL API endpoint.

Also remove duplicate PPL Commands/Functions entries from the Reference
sidebar section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Escape single quotes in RISON encoding (%27 → !%27) to prevent
  premature termination of query strings containing PPL literals
- Widen time range from now-15m to now-6h so playground shows data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorder sidebar: Overview, Get Started, Send Data, PPL, Discover,
Agent Observability, Application Monitoring, Dashboards & Visualize,
Alerting, Agent Health, SDKs/MCP & Clients, Claude Code.

Rename "Alerting & Detection" to "Alerting" and "Reference" to
"SDKs, MCP & Clients". Replace all em dashes with hyphens across
71 doc files. Fix anchor links broken by the em dash removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rewrite grok/rex/parse/spath examples with verified patterns from real
  OTel data (Envoy access logs, Kafka broker logs) instead of fabricated
  Apache log patterns
- Fix expand/flatten docs to use OTel indices instead of fabricated my-index
- Add Data Prepper flat schema notes to expand/flatten
- Fix timechart trace example with timefield=startTime
- Fix head.md dedup+head example that was missing dedup
- Fix search.md operator precedence note (PPL OR>AND differs from SQL)
- Add stats earliest()/latest() example, where BETWEEN example
- Fix 22 broken See Also links across 17 command docs to point to
  specific command pages instead of generic index
- Add masterclass pipeline examples to examples.md (service health
  scorecard, GenAI cost analysis, Envoy log parsing, error pattern
  discovery, cross-signal log-trace correlation)
- Add advanced analytics section (eventstats, streamstats, trendline)
- Generate 28 new playground URLs for all added examples
- Build validates with all internal links valid

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
docs: add PPL language reference with data-grounded examples
@codecov

codecov bot commented Mar 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.51%. Comparing base (5d6beb0) to head (21c5483).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #142   +/-   ##
=======================================
  Coverage   18.51%   18.51%           
=======================================
  Files           3        3           
  Lines          54       54           
  Branches       18       19    +1     
=======================================
  Hits           10       10           
  Misses         44       44           


@anirudha
Collaborator Author

Closing in favor of a new PR with squashed commit and DCO sign-off.

@anirudha anirudha closed this Mar 28, 2026