docs: update eval results — context retention 3.9→4.6, +77% vs tail-window #435

Merged
BYK merged 1 commit into main from docs-eval-results
May 21, 2026
Conversation

@BYK
Owner

@BYK BYK commented May 20, 2026

Updates marketing copy with the latest eval results from the recall quality + distillation transparency work (#428, #430, #431, #432, #433, #434).

README.md

  • Context retention table: Medium 2.3→4.1, Hard 3.3→4.8, Average 3.9→4.6
  • Lore vs tail-window delta: +50%→+77%
  • Added footnote: Lore scores are averaged across multiple runs; the tail-window (TW) and compaction baselines come from a prior eval run with the same scenarios
  • Added v6 to version history

docs/index.html

  • Hero stat: +50%→+77% vs tail-window
  • Detail retention: 4.8→4.6 (overall average across difficulty levels, multiple runs)

Review corrections

  • Fixed Medium from 4.3→4.1 (honest multi-run average, not cherry-picked)
  • Average row (4.6) now self-consistent with column values: (5.0+4.1+4.8)/3=4.63≈4.6
  • Added footnote clarifying that the tail-window (TW) and compaction columns are from a prior eval run
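
The self-consistency check on the Average row can be reproduced with a few lines of Python. This is only a sketch of the arithmetic stated above, using the three column values from this description (5.0, 4.1, 4.8). The difficulty labels used as keys and the function name `average_score` are illustrative assumptions, not names from the eval harness.

```python
def average_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the per-difficulty scores."""
    return sum(scores.values()) / len(scores)


# Column values quoted in the review corrections above.
# The keys are illustrative labels, not identifiers from the eval code.
lore_scores = {"easy": 5.0, "medium": 4.1, "hard": 4.8}

average = average_score(lore_scores)
reported_average = 4.6

# The reported figure should match the mean rounded to one decimal place.
assert round(average, 1) == reported_average
print(f"mean = {average:.2f}, reported = {reported_average}")
```

Running the same check with the earlier Medium value of 4.3 gives a mean of 4.7, so the Average row would no longer round to the reported 4.6.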

@BYK BYK self-assigned this May 20, 2026
@BYK BYK force-pushed the docs-eval-results branch from 9c91a5c to 1b89d12 Compare May 20, 2026 23:31
@BYK BYK merged commit 93de73a into main May 21, 2026
7 checks passed
@BYK BYK deleted the docs-eval-results branch May 21, 2026 06:06
This was referenced May 21, 2026