Follow-up from docs/issues/2026-04-12-accuracy-from-live-audit.md (R3) and docs/plans/2026-04-18-live-audit-follow-through.md.
Problem
crawl-sim now computes parity, but the user-facing presentation still has a lingering ambiguity: when all bots effectively see the same thing, redundant per-bot rows add noise and imply differentiation where there is none.
Goal
Make parity-first presentation the default when bot deltas are below a meaningful threshold, while preserving the full per-bot breakdown when divergence is real.
Scope
- Define the collapse threshold for user-facing output/report rendering
  - likely something like max(score) - min(score) <= 5 and near-identical visibility/parity metrics
- Collapse the scorecard/report into a single summary line when that threshold is met
- Keep the expanded per-bot table when divergence is meaningful
- Ensure HTML/PDF/report renderers and SKILL examples stay consistent with the new behavior
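The collapse rule in the scope items above could be sketched roughly as follows. This is a minimal illustration, not crawl-sim's actual code: `BotResult`, `should_collapse`, `render_scorecard`, the bot names, and the output wording are all hypothetical, and the threshold of 5 is only the tentative value floated above. The "near-identical visibility/parity metrics" condition is left out because this issue does not yet define it.

```python
from dataclasses import dataclass

# Tentative value from the scope list; the final threshold is still to be defined.
SCORE_SPREAD_THRESHOLD = 5


@dataclass
class BotResult:
    # Hypothetical shape; the real crawl-sim result structure may differ.
    bot: str
    score: int
    grade: str


def should_collapse(results, threshold=SCORE_SPREAD_THRESHOLD):
    """Return True when max(score) - min(score) is within the threshold."""
    if not results:
        return False
    scores = [r.score for r in results]
    return max(scores) - min(scores) <= threshold


def render_scorecard(results, threshold=SCORE_SPREAD_THRESHOLD):
    """Return one summary line at parity, otherwise one row per bot."""
    if not should_collapse(results, threshold):
        # Divergence is meaningful: keep the expanded per-bot rows.
        return [f"{r.bot}: {r.score}/{r.grade}" for r in results]
    scores = [r.score for r in results]
    lo, hi = min(scores), max(scores)
    score_text = str(lo) if lo == hi else f"{lo}-{hi}"
    grades = ", ".join(sorted({r.grade for r in results}))
    return [f"All {len(results)} bots at parity: {score_text} ({grades})"]


# Placeholder bot names, mirroring the "four rows of 94/A" case.
parity = [BotResult(f"bot-{i}", 94, "A") for i in range(1, 5)]
divergent = [BotResult("bot-1", 94, "A"), BotResult("bot-2", 60, "D")]
print(render_scorecard(parity))
print(render_scorecard(divergent))
```

Keeping the decision in one predicate would let the HTML, PDF, and plain-text renderers share the same rule, which is one way to keep them consistent.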
Acceptance criteria
Why this matters
The live audit's “four rows of 94/A” were technically true but semantically noisy. Parity should be legible at a glance.