📌 Description
OpenTelemetry trace-context propagation and Prometheus metrics exist, but there is
no single guide explaining the observability stack: what traces and metrics are
emitted, how trace context flows from Express through the Horizon listener, how to
read the dashboards, and how to correlate a trace with logs and metrics during an
incident. docs/operations-metrics.md lists gauges but does not tell the end-to-end story.
This issue tracks writing that observability operations guide.
🎯 Requirements and Context
- Document the trace-propagation path (HTTP → middleware → listener → jobs) and
the correlation-ID-to-trace-id relationship.
- Catalogue exported metrics and their intended SLO/alert use.
- Document how to correlate traces, structured logs, and metrics for a given
request during triage.
- Cross-link the on-call SLO runbook.
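To make the correlation-ID-to-trace-id relationship concrete, the guide could include a small illustration like the one below. It is a minimal sketch, assuming the incoming request carries a W3C `traceparent` header in the four-field version-00 form. The names `parseTraceparent` and `logFields` and the log field names `correlation_id`, `trace_id`, and `span_id` are hypothetical and are not taken from this codebase.

```typescript
// Hypothetical helpers for illustration only; the project's real middleware may differ.
interface TraceContext {
  version: string;
  traceId: string;
  parentId: string;
  sampled: boolean;
}

// Parses a W3C traceparent header: version-traceid-parentid-flags, all lowercase hex.
// This sketch is strict: it only accepts the four-field form.
function parseTraceparent(header: string | undefined): TraceContext | null {
  if (!header) return null;
  const re = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
  const match = re.exec(header.trim());
  if (!match) return null;
  const version = match[1];
  const traceId = match[2];
  const parentId = match[3];
  const flags = match[4];
  // Version "ff" and all-zero ids are invalid per the Trace Context spec.
  if (version === "ff") return null;
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  // The lowest bit of trace-flags is the "sampled" flag.
  const sampled = (parseInt(flags, 16) & 0x01) === 0x01;
  return { version, traceId, parentId, sampled };
}

// Builds the fields a structured log line would carry so that a log entry
// can be joined to its trace during triage. Field names are assumptions.
function logFields(
  correlationId: string,
  header: string | undefined
): Record<string, string> {
  const ctx = parseTraceparent(header);
  if (!ctx) return { correlation_id: correlationId };
  return {
    correlation_id: correlationId,
    trace_id: ctx.traceId,
    span_id: ctx.parentId,
  };
}
```

With fields like these on every log line, an on-call engineer could search logs by `trace_id` taken from a trace view, or go the other way from a logged `correlation_id` to the trace.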
🛠️ Suggested Execution
1. Fork the repo and create a branch
git checkout -b docs/observability-guide
2. Implement changes
- Author docs/observability.md, cross-linking docs/operations-metrics.md and
docs/runbooks/on-call-slo.md.
- Add src/tests/docs.observability.test.ts asserting documented metric names
match those registered in code.
3. Test and commit
- Run:
npm test -- src/tests/docs.observability.test.ts
- Cover edge cases: documented metric names exist in the registry, required
sections present, runbook cross-link valid.
Example commit message
docs: observability and tracing operations guide
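The core of the docs test described in step 2 can be sketched as follows. This is only one possible shape. It assumes the guide lists each metric in a Markdown table whose first cell is the backtick-quoted metric name. The function names `extractDocumentedMetrics` and `findUnregistered` are hypothetical, and how the registered names are read depends on the metrics library the project actually uses.

```typescript
// Hypothetical helpers for illustration; not taken from this codebase.

// Collects metric names from table rows of the assumed form:
// | `metric_name` | description ... |
function extractDocumentedMetrics(markdown: string): string[] {
  const rowPattern = /^\|\s*`([a-zA-Z_:][a-zA-Z0-9_:]*)`\s*\|/gm;
  const names = new Set<string>();
  let match: RegExpExecArray | null = rowPattern.exec(markdown);
  while (match !== null) {
    names.add(match[1]);
    match = rowPattern.exec(markdown);
  }
  return Array.from(names).sort();
}

// Returns documented names that are absent from the registry,
// which is what the test should assert to be empty.
function findUnregistered(documented: string[], registered: string[]): string[] {
  const known = new Set(registered);
  return documented.filter((name) => !known.has(name));
}
```

In the real test file, the Markdown would be read from docs/observability.md and the registered names from the application's metrics registry; the assertion is then that `findUnregistered` returns an empty array.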
✅ Guidelines
- Minimum 95% test coverage on any assertion code added.
- Documented metric names must match the registry.
- Timeframe: 96 hours.
🏷️ Labels
type-documentation · area-backend · MAYBE REWARDED · GRANTFOX OSS · OFFICIAL CAMPAIGN
💬 Community & Support
- Join the contributor Discord to coordinate, ask questions, and get unblocked fast: https://discord.gg/xvNAvMJf
- Please introduce yourself in the channel before you start so we can avoid duplicate work, pair you with a reviewer, and get your PR merged quickly.
- Maintainers actively triage this channel and aim for fast, clear, respectful reviews — reach out any time you're blocked.