Context
Today, running /crawl-sim https://example.com twice produces two independent reports. The second run has no idea the first run existed. Users have no way to answer:
- "Did the refactor we just shipped actually improve our GPTBot score?"
- "Which findings regressed since last week?"
- "Is our llms.txt score trending up?"
Claude's memory system can't be the answer here because:
- It's per-user and per-session
- It doesn't survive a fresh Claude Code install, a machine change, or a memory wipe
- It's not shareable with teammates
- It's not available at all on other agent harnesses (Codex, Gemini, custom runners)
The answer needs to live in the file system, next to the audited project, independent of any particular agent platform.
Principle
Every audit writes a snapshot. Every audit compares against the previous snapshot for the same URL. No agent memory required.
Proposal
File layout
<cwd>/.crawl-sim/
├── snapshots/
│ ├── example_com.json # latest snapshot for this URL
│ └── example_com.history.jsonl # append-only log of all snapshots
└── README.md # auto-generated, explains the dir
The directory lives in the user's working directory, not the crawl-sim install location, so each project gets its own snapshot history.
Snapshot schema
Each snapshot is the current compute-score.sh output plus two metadata fields:
{
"url": "https://example.com",
"timestamp": "2026-04-11T22:41:00Z",
"version": "0.1.0",
"gitCommit": "bb62a6d (optional — recorded if the cwd is a git repo)",
"overall": { "score": 88, "grade": "A-" },
"bots": { ... },
"categories": { ... },
"findings": [ ... ]
}
URL → filename mapping: replace / with _, drop scheme, drop query string. https://example.com/blog/post?ref=x → example_com_blog_post.json.
Diffing
When a new audit runs and a previous snapshot exists for the same URL:
- Load the old snapshot
- Compute deltas per bot, per category, and overall
- Match findings by title/category to detect resolved vs persisting vs new issues
- Add a "since last run" block to the output
Example output addition:
Overall: 88/100 (A-) [▲ +6 since 2026-04-04 (17 days ago)]
Per-bot change:
Googlebot 95 A [▲ +2]
GPTBot 82 B+ [▲ +12] ← biggest improvement
ClaudeBot 82 B+ [▲ +12]
PerplexityBot 79 B [—]
Per-category change:
Accessibility 96 A [—]
Content Visibility 81 B+ [▲ +14] ← likely fix: SSR article cards
Structured Data 92 A [▼ -7] ← regression: BreadcrumbList removed
Technical Signals 90 A [—]
AI Readiness 65 C [▲ +15] ← new llms.txt file
Resolved since last run: 2 findings
Persisting: 3 findings
New: 1 finding (Structured Data regression)
After showing the diff, append the new snapshot to both the latest file and the .history.jsonl log.
History file
example_com.history.jsonl is append-only. One JSON object per line, one per run. Useful for:
- Future trend charts
- Debugging "when did this regress?"
- Generating sparklines in reports
Unlimited retention for now; add pruning (e.g., keep last 100) if the file gets unwieldy.
New script
scripts/compare-snapshots.sh <current.json> <previous.json> — outputs a JSON diff object that the skill can include in the narrative. Pure function, easy to test.
Orchestration changes
SKILL.md step 9 ("Interpret and produce output") grows a pre-step:
Step 8.5: Diff against previous snapshot
- Look for .crawl-sim/snapshots/<url>.json in the cwd
- If it exists, load it and run compare-snapshots.sh
- Pass the diff to the interpretation step
- At the end, write the new snapshot to the same location and append to history.jsonl
Flags
/crawl-sim <url> --no-snapshot — skip writing the new snapshot (useful for exploratory runs)
/crawl-sim <url> --no-diff — skip the comparison step
/crawl-sim <url> --baseline — force overwrite the existing snapshot as the new reference point (resets deltas to zero)
Future: git-tracking
Not scoping this now, but noting for later: an opt-in mode where .crawl-sim/snapshots/ is committed to the user's repo so every PR's audit is version-controlled. This enables:
- PR checks that fail the build on score regressions (
jq -e '.overall.score >= 70')
- Historical review via
git log -p .crawl-sim/snapshots/
- Team-shareable baselines without a central database
A follow-up issue can add a --git flag that ensures .crawl-sim/ is tracked, adds a suggested .gitignore exemption, and outputs a suggested CI step. For now, file-based local snapshots are enough — leave it out of this issue.
Acceptance criteria
Out of scope
- Git-tracking automation (future issue)
- Trend charts / visualizations (future, once history has enough data)
- Cloud sync of snapshots (never — the whole point is local-first and agent-agnostic)
- Pruning old history (add when it becomes a real problem)
Why this is the best answer
- Works identically across agent platforms (Claude Code, Codex, custom runners)
- Survives memory wipes, machine changes, and reinstalls
- Shareable with teams via the user's existing version control
- Enables CI integration with no new infrastructure
- Keeps the "scripts as deterministic evidence" contract — the snapshot is just another script output
Context
Today, running
/crawl-sim https://example.comtwice produces two independent reports. The second run has no idea the first run existed. Users have no way to answer:Claude's memory system can't be the answer here because:
The answer needs to live in the file system, next to the audited project, independent of any particular agent platform.
Principle
Proposal
File layout
The directory lives in the user's working directory, not the crawl-sim install location, so each project gets its own snapshot history.
Snapshot schema
Each snapshot is the current
compute-score.shoutput plus two metadata fields:{ "url": "https://example.com", "timestamp": "2026-04-11T22:41:00Z", "version": "0.1.0", "gitCommit": "bb62a6d (optional — recorded if the cwd is a git repo)", "overall": { "score": 88, "grade": "A-" }, "bots": { ... }, "categories": { ... }, "findings": [ ... ] }URL → filename mapping: replace
/with_, drop scheme, drop query string.https://example.com/blog/post?ref=x→example_com_blog_post.json.Diffing
When a new audit runs and a previous snapshot exists for the same URL:
Example output addition:
After showing the diff, append the new snapshot to both the latest file and the
.history.jsonllog.History file
example_com.history.jsonlis append-only. One JSON object per line, one per run. Useful for:Unlimited retention for now; add pruning (e.g., keep last 100) if the file gets unwieldy.
New script
scripts/compare-snapshots.sh <current.json> <previous.json>— outputs a JSON diff object that the skill can include in the narrative. Pure function, easy to test.Orchestration changes
SKILL.md step 9 ("Interpret and produce output") grows a pre-step:
Flags
/crawl-sim <url> --no-snapshot— skip writing the new snapshot (useful for exploratory runs)/crawl-sim <url> --no-diff— skip the comparison step/crawl-sim <url> --baseline— force overwrite the existing snapshot as the new reference point (resets deltas to zero)Future: git-tracking
Not scoping this now, but noting for later: an opt-in mode where
.crawl-sim/snapshots/is committed to the user's repo so every PR's audit is version-controlled. This enables:jq -e '.overall.score >= 70')git log -p .crawl-sim/snapshots/A follow-up issue can add a
--gitflag that ensures.crawl-sim/is tracked, adds a suggested.gitignoreexemption, and outputs a suggested CI step. For now, file-based local snapshots are enough — leave it out of this issue.Acceptance criteria
scripts/compare-snapshots.sh <current.json> <previous.json>exists, outputs a structured diff JSONcompute-score.shoutput is stable enough to diff cleanly across runs (no timestamps / random IDs in the comparable fields)<cwd>/.crawl-sim/snapshots/.history.jsonlis append-only and never rewritten--no-snapshot,--no-diff,--baseline.crawl-sim/README.mdon first use to explain the directoryOut of scope
Why this is the best answer