fix: context health note — signal omitted details instead of false assurance by BYK · Pull Request #434 · BYK/loreai

BYK · 2026-05-20T23:10:59Z

Summary

Updates the context health note injected at gradient layer 1+ to honestly signal that distilled observations are lossy summaries with details omitted, rather than falsely assuring the model that details are preserved.

Problem

The context health note said:

Earlier details (file paths, error messages, decisions) are preserved in distilled observations and searchable via the recall tool.

This gave the model false confidence. When it saw a compressed distillation mentioning "flock locking" but not the rejected "staleness check" alternative, it answered "no alternatives were considered" without invoking recall — because the note said details were preserved.

Fix

Updated both variants of the context health note to:

Distilled observations above are lossy summaries — specific details (exact error messages, rejected alternatives, file paths, numerical values) are likely omitted. Use recall to verify any specific claim before answering questions about what happened, what was considered, or what the exact values were.

Key changes:

"preserved" → "lossy summaries" — honest about compression loss
Lists specific detail types that are likely omitted (rejected alternatives, exact error messages)
"Use recall proactively" → "Use recall to verify any specific claim" — stronger guidance
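The wording change above can be guarded by a small regression check. The following is a minimal sketch, not code from the loreai repository: `checkHealthNote`, `NoteCheck`, and the regular expressions are hypothetical names chosen for illustration, and `newNote` is an abbreviated paraphrase of the updated note rather than the exact string.

```typescript
// Hypothetical helper: not part of loreai's actual API.
// Classifies a context health note by the signals it sends to the model.
interface NoteCheck {
  signalsLoss: boolean;    // says observations are lossy or details are omitted
  mentionsRecall: boolean; // points the model at the recall tool
  falseAssurance: boolean; // claims that details are preserved
}

function checkHealthNote(note: string): NoteCheck {
  return {
    signalsLoss: /lossy|omitted/i.test(note),
    mentionsRecall: /\brecall\b/i.test(note),
    falseAssurance: /\b(is|are) preserved\b/i.test(note),
  };
}

// Old wording, quoted from the Problem section.
const oldNote =
  "Earlier details (file paths, error messages, decisions) are preserved " +
  "in distilled observations and searchable via the recall tool.";

// New wording, abbreviated from the Fix section (assumption: not the exact string).
const newNote =
  "Distilled observations above are lossy summaries; specific details are " +
  "likely omitted. Use recall to verify any specific claim before answering.";

const oldCheck = checkHealthNote(oldNote);
const newCheck = checkHealthNote(newNote);
```

Under these assumptions, `oldCheck.falseAssurance` is true and `oldCheck.signalsLoss` is false, while `newCheck` reverses both. A keyword check like this only catches wording regressions; it says nothing about whether the model actually invokes recall.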

Eval Results (CM-1, 400K inflation)

Two runs: 4.62 and 4.69 (average ~4.65, up from 3.69 baseline).

m3 (the persistent failure) scored 5.0 on one run when the distillation observer preserved the alternative, and 2.2 on the other when it didn't. The remaining variance is in distillation quality (non-deterministic LLM output), not in the context health note.
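The arithmetic behind these figures can be reproduced with a short sketch. The `summarize` helper is illustrative and not part of the eval harness; the scores are the ones reported above.

```typescript
// Illustrative helper, not from the loreai codebase.
// Returns the mean and the max-min spread of a list of eval scores.
function summarize(scores: number[]): { mean: number; spread: number } {
  if (scores.length === 0) throw new Error("need at least one score");
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  const spread = Math.max(...scores) - Math.min(...scores);
  return { mean, spread };
}

const overall = summarize([4.62, 4.69]); // two full runs
const m3 = summarize([5.0, 2.2]);        // the m3 item across the same two runs
const gainOverBaseline = overall.mean - 3.69; // roughly 0.97 points
```

The overall scores differ by only about 0.07 between runs, while m3 alone swings by about 2.8, which is consistent with the remaining variance coming from the distillation step rather than from the note.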

Tests

1752 pass, 0 fail
Typecheck clean across all 4 packages

…surance The context health note previously said 'details are preserved in distilled observations' — which is inaccurate when observations are heavily compressed. The model trusts this and doesn't invoke recall, missing specific details like rejected alternatives and exact error messages. Updated to explicitly state that distilled observations are lossy summaries with specific details likely omitted, and that recall should be used to verify claims before answering questions about specifics. Eval score: 4.62-4.69 at 400K inflation (avg ~4.65, variance-dependent).

…indow (#435)

Updates marketing copy with the latest eval results from the recall quality + distillation transparency work (#428, #430, #431, #432, #433, #434).

### README.md
- Context retention table: Medium 2.3→4.1, Hard 3.3→4.8, Average 3.9→4.6
- Lore vs tail-window delta: +50%→+77%
- Added footnote: Lore scores averaged across multiple runs; TW/compaction baselines from a prior eval run with the same scenarios
- Added v6 to version history

### docs/index.html
- Hero stat: +50%→+77% vs tail-window
- Detail retention: 4.8→4.6 (overall average across difficulty levels, multiple runs)

### Review corrections
- Fixed Medium from 4.3→4.1 (honest multi-run average, not cherry-picked)
- Average row (4.6) now self-consistent with column values: (5.0+4.1+4.8)/3=4.63≈4.6
- Added footnote clarifying that TW/compaction columns are from a prior eval run

BYK self-assigned this May 20, 2026

BYK merged commit d2c59bb into main May 20, 2026
10 checks passed

BYK deleted the fix-context-health-note branch May 20, 2026 23:19

BYK mentioned this pull request May 20, 2026

docs: update eval results — context retention 3.9→4.6, +77% vs tail-window #435

Merged

This was referenced May 21, 2026

publish: BYK/loreai@0.23.0 #439

Closed

publish: BYK/loreai@0.23.0 #448

Closed

BYK merged 1 commit into main from fix-context-health-note
