docs: add PPL language reference with data-grounded examples by anirudha · Pull Request #142 · opensearch-project/observability-stack

anirudha · 2026-03-28T04:22:49Z

Summary

Add a comprehensive PPL (Piped Processing Language) documentation section to the Observability Stack docs, targeting Splunk SREs who are evaluating PPL as a query language for OpenSearch observability.

27 detailed per-command reference pages with consistent structure
All examples use real OTel data from logs-otel-v1* and otel-v1-apm-span-* indices - no fabricated data
Every example verified against local OpenSearch PPL API and includes a playground link
PPL overview page positioning it as the native query language (with KQL/EQL comparison)
Function reference covering 200+ built-in functions across 13 categories
Masterclass pipeline examples showcasing PPL's full power for SRE workflows

New pages

Page	Description
`ppl/index.md`	PPL overview - why PPL, comparison table, getting started
`ppl/commands.md`	Command reference summary (50+ commands)
`ppl/commands/*.md`	27 individual command pages with full detail
`ppl/functions.md`	Function reference (aggregation, string, datetime, math, etc.)
`ppl/examples.md`	Real-world OTel queries with playground links

Per-command pages

Search & Filter: search, where
Fields & Transformation: fields, eval, rename, fillnull, expand, flatten
Aggregation & Statistics: stats, eventstats, streamstats, timechart, trendline
Sorting & Limiting: sort, head, dedup, top, rare
Text Extraction: parse, grok, rex, patterns, spath
Data Combination: join, lookup
Machine Learning: ml
Metadata: describe

Each command page follows a consistent structure:

Description - what it does, when to use it
Syntax - full syntax block
Arguments - required/optional table with defaults
Usage notes - behavioral notes, gotchas, performance tips
Basic examples (3-5) - with playground links
Extended examples (1-2) - OTel observability scenarios
See also - cross-references to related commands
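
The consistent page structure above lends itself to an automated check. The following is a minimal sketch, not part of this PR: it assumes each section appears as a level-2 markdown heading with exactly the names listed above, and the helper name `missing_or_misordered` is hypothetical.

```python
import re

# Section names taken from the list above; treating them as level-2
# markdown headings is an assumption made for this sketch.
REQUIRED_SECTIONS = [
    "Description",
    "Syntax",
    "Arguments",
    "Usage notes",
    "Basic examples",
    "Extended examples",
    "See also",
]


def missing_or_misordered(markdown_text):
    """Return required section names that are absent or out of order."""
    headings = re.findall(r"^##\s+(.+?)\s*$", markdown_text, flags=re.MULTILINE)
    problems = []
    position = 0
    for name in REQUIRED_SECTIONS:
        try:
            # Search only after the previous required heading to enforce order.
            position = headings.index(name, position) + 1
        except ValueError:
            problems.append(name)
    return problems
```

Running this over each file matched by `ppl/commands/*.md` would flag pages that drift from the template; an empty list means the page conforms.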

Examples page highlights

SRE incident response: error rate over time, first error per service, P95 latency timeseries
Trace analysis: slowest traces, error spans, latency percentiles, trace fan-out
AI agent observability: token usage, cost analysis, tool execution, agent invocation latency
Advanced analytics: eventstats outlier detection, streamstats rolling windows, trendline smoothing
Masterclass pipelines: service health scorecard, GenAI cost/perf analysis, Envoy access log parsing, ML-based error pattern discovery, cross-signal log-trace correlation

Other changes

Sidebar reordered: Overview → Get Started → Send Data → PPL → Discover → ...
Updated main docs index and investigate page with PPL links
README updated with PPL section

Data grounding

All text extraction examples (grok, rex, parse, spath) were tested against actual log bodies in the cluster:

Envoy access logs from frontend-proxy: [timestamp] "METHOD /path HTTP/1.1" status ...
Kafka broker logs: [ComponentName id=N] message ...
Load generator logs: User action product: ID

Key PPL behavioral findings documented:

parse requires full-string match (implicitly anchored); rex does partial matching
Java regex named capture groups cannot contain underscores (names must be letters and digits only, so use camelCase)
Grok patterns with multiple unnamed %{DATA} cause "Duplicate key" errors
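
The first two findings can be illustrated outside PPL. The sketch below uses Python's `re` module as a stand-in for the Java regex engine, so it shows the matching semantics rather than PPL itself. The sample log line is hypothetical, shaped like the Kafka broker format above, and the helper `is_java_group_name` is an illustrative name, not an existing API.

```python
import re

# Hypothetical sample line shaped like the Kafka broker format noted above.
LINE = "[GroupCoordinator id=1] Preparing to rebalance group"

# Pattern covering only the bracketed prefix, not the trailing message.
PREFIX = r"\[(?P<component>\w+) id=(?P<brokerId>\d+)\]"

# Full-string matching (the behavior described for parse): the uncovered
# message text makes the match fail.
full = re.fullmatch(PREFIX, LINE)

# Partial matching (the behavior described for rex): the prefix is found.
partial = re.search(PREFIX, LINE)

# Extending the pattern to consume the rest of the line satisfies both.
full_ok = re.fullmatch(PREFIX + r" (?P<message>.*)", LINE)

# Java group names must start with a letter and contain only ASCII letters
# and digits. Python itself accepts underscores, so this check is explicit.
JAVA_GROUP_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_java_group_name(name):
    """Return True if the name would be a valid Java named-group name."""
    return JAVA_GROUP_NAME.fullmatch(name) is not None
```

Here `full` is `None` while `partial` yields the component and broker id, which mirrors why a pattern that works in `rex` may need a trailing `.*` before it works in `parse`.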

Test plan

npm run build passes with all internal links validated (starlight-links-validator)
All playground URLs use correct RISON encoding (!%27 for single quotes)
grok/rex/parse/spath patterns verified against real OTel data via local PPL API
No fabricated data (my-index, accounts, Apache CLF) remains in any example
All See Also links point to correct specific command pages
Visual review of each page in browser
Verify playground links open correctly with pre-filled queries
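
The RISON check in the test plan can be sketched as follows. This is an illustration under stated assumptions rather than the tooling used in this PR: it only escapes and percent-encodes the query text, it does not build a full playground URL, and the function names and the `severityText` field in the usage note are hypothetical.

```python
from urllib.parse import quote


def rison_escape(text):
    """Escape RISON's special characters inside a quoted string."""
    # '!' is RISON's escape character, so it is doubled first;
    # a literal single quote then becomes !'
    return text.replace("!", "!!").replace("'", "!'")


def encode_query_for_url(ppl_query):
    """Percent-encode a RISON-escaped PPL query, keeping '!' literal."""
    # safe="!" keeps the escape character readable, so a quote ends up as !%27
    return quote(rison_escape(ppl_query), safe="!")
```

For a query fragment such as `where severityText = 'ERROR'`, the literal's quotes come out as `!%27`, so they no longer terminate the surrounding RISON string early.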

🤖 Generated with Claude Code

Add a full PPL (Piped Processing Language) documentation section to the Observability Stack docs, positioning PPL as the native query language for logs and traces.

New pages:
- PPL overview with comparison to KQL and EQL
- Command reference summary (50+ commands)
- 27 detailed per-command reference pages with Description, Syntax, Arguments, Usage notes, Basic/Extended examples, and See also
- Function reference (200+ functions across 13 categories)
- Observability examples with live playground links for OTel data

Commands documented individually: search, where, fields, eval, rename, fillnull, expand, flatten, stats, eventstats, streamstats, timechart, trendline, sort, head, dedup, top, rare, parse, grok, rex, patterns, spath, join, lookup, ml, describe

Updated:
- Sidebar with categorized PPL command navigation
- Main docs index with PPL section and LinkCard
- Investigate page with links to new PPL reference
- README with PPL section and example query

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace every generic example (accounts, gender, age, etc.) across all 27 PPL command pages with real observability data from logs-otel-v1* and otel-v1-apm-span-* indices. All queries validated against the local OpenSearch PPL API endpoint.

Also remove duplicate PPL Commands/Functions entries from the Reference sidebar section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Escape single quotes in RISON encoding (%27 → !%27) to prevent premature termination of query strings containing PPL literals
- Widen time range from now-15m to now-6h so playground shows data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Reorder sidebar: Overview, Get Started, Send Data, PPL, Discover, Agent Observability, Application Monitoring, Dashboards & Visualize, Alerting, Agent Health, SDKs/MCP & Clients, Claude Code.

Rename "Alerting & Detection" to "Alerting" and "Reference" to "SDKs, MCP & Clients".

Replace all em dashes with hyphens across 71 doc files. Fix anchor links broken by the em dash removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Rewrite grok/rex/parse/spath examples with verified patterns from real OTel data (Envoy access logs, Kafka broker logs) instead of fabricated Apache log patterns
- Fix expand/flatten docs to use OTel indices instead of fabricated my-index
- Add Data Prepper flat schema notes to expand/flatten
- Fix timechart trace example with timefield=startTime
- Fix head.md dedup+head example that was missing dedup
- Fix search.md operator precedence note (PPL OR>AND differs from SQL)
- Add stats earliest()/latest() example, where BETWEEN example
- Fix 22 broken See Also links across 17 command docs to point to specific command pages instead of generic index
- Add masterclass pipeline examples to examples.md (service health scorecard, GenAI cost analysis, Envoy log parsing, error pattern discovery, cross-signal log-trace correlation)
- Add advanced analytics section (eventstats, streamstats, trendline)
- Generate 28 new playground URLs for all added examples
- Build validates with all internal links valid

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add PPL language reference with data-grounded examples

codecov · 2026-03-28T04:24:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.51%. Comparing base (5d6beb0) to head (21c5483).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #142   +/-   ##
=======================================
  Coverage   18.51%   18.51%           
=======================================
  Files           3        3           
  Lines          54       54           
  Branches       18       19    +1     
=======================================
  Hits           10       10           
  Misses         44       44
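
The headline percentage follows from the Hits and Lines rows of the table. The sketch below reproduces it under one assumption: that the figure is truncated, not rounded, to two decimals (rounding would give 18.52). The function name is hypothetical and this is not Codecov's implementation.

```python
import math


def coverage_percent(hits, lines):
    """Line coverage as a percentage, truncated to two decimals."""
    # Scale to hundredths of a percent, drop the remainder, scale back.
    return math.floor(hits / lines * 10000) / 100
```

With the values from the report, `coverage_percent(10, 54)` gives 18.51 for both base and head, which matches the unchanged coverage shown.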


anirudha · 2026-03-28T04:24:12Z

Closing in favor of a new PR with squashed commit and DCO sign-off.

anirudha and others added 6 commits March 27, 2026 07:55

Merge pull request #1 from anirudha/ppl-language-docs

21c5483

docs: add PPL language reference with data-grounded examples

anirudha requested review from goyamegh, kylehounslow, ps48 and vamsimanohar as code owners March 28, 2026 04:22

anirudha closed this Mar 28, 2026

anirudha wants to merge 6 commits into opensearch-project:main from anirudha:main
