Skip to content

test: broaden leaf coverage — escape/identifier units + 2.1.169 golden snapshot#6

Open
dividedby wants to merge 2 commits into
skrabe:mainfrom
dividedby:test/broaden-leaf-coverage
Open

test: broaden leaf coverage — escape/identifier units + 2.1.169 golden snapshot#6
dividedby wants to merge 2 commits into
skrabe:mainfrom
dividedby:test/broaden-leaf-coverage

Conversation

@dividedby

Copy link
Copy Markdown

Tests-only coverage broadening (control-plane Piebald-AI#8leaf test broadening). No patch/extraction logic touched; merge is yours.

What

Two new vitest files closing the highest-value gaps a version bump could slip through:

src/systemPromptCustomization.test.ts — direct units for the previously-untested exported helpers carrying the extract → map → apply customization pipeline:

  • buildRegexFromPieces — regex-metacharacter and ${VAR}-literal escaping (pieces that look like regex / interpolation must match literally)
  • extractUserCustomizations + buildHumanToRealMapping — round-trip recovery, conflict-throw, and unmapped-identifier skip paths
  • reconstructContentFromPieces — human-name interleave + UNKNOWN_<idx> fallback
  • extractOriginalWhitespace / applyOriginalWhitespace — boundary-whitespace preservation

src/tests/promptsGolden.test.ts (+ snapshot) — golden digest over the pinned in-matrix prompts-2.1.169.json (the extractor's committed output): prompt count, unique-id sha, identifier-value surface, plus the duplicate-id groups and unmapped-identifier prompts. A bump that silently drops, renames, or re-shapes a prompt flips the snapshot in review.

Scope notes

  • Tests only. The deep escape internals (escapeNonAsciiForRegex, hexAlt, buildSearchRegexFromPieces, applyIdentifierMapping) are not exported, so they're out of scope here — covering them would need an export decision from you.
  • The golden captures, rather than asserts away, two pre-existing extractor quirks in 2.1.169, surfaced as candidate findings (not fixed here):
    • 4 duplicate-id groups (system-reminder-cross-session-peer-message-authority-warning appears 4×).
    • 3 skill-code-review-phase-* prompts whose identifier index has no identifierMap entry → reconstruct as UNKNOWN_<idx>.

pnpm test green (343 passed, +17), pnpm lint clean.

🤖 Generated with Claude Code

Adds direct unit coverage for the previously-untested exported helpers that
carry the extract → map → apply prompt-customization pipeline, plus a golden
digest locking the extractor's output for the pinned in-matrix version.

- systemPromptCustomization.test.ts: buildRegexFromPieces regex-metachar /
  ${VAR}-literal escaping, extractUserCustomizations + buildHumanToRealMapping
  round-trip and conflict/skip paths, reconstructContentFromPieces, and the
  whitespace extract/apply pair.
- promptsGolden.test.ts (+ snapshot): pins prompts-2.1.169.json structural
  shape and a digest (prompt count, unique-id sha, duplicate-id groups,
  unmapped-identifier prompts, identifier-value surface) so a version bump
  cannot silently drop, rename, or re-shape a prompt without flipping review.

Tests only; no patch/extraction logic changed. The snapshot captures two
pre-existing extractor quirks (4 duplicate-id groups; 3 skill-code-review
prompts whose identifier index has no identifierMap entry) as candidate
findings rather than asserting them away.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
dividedby added a commit to dividedby/tweakcc-maint that referenced this pull request Jun 9, 2026
…zed covered (#68)

Inventory posted + tweakcc-fixed test PR prepared (skrabe/tweakcc-fixed#6:
escape/identifier units + 2.1.169 golden). Lobotomized per-override coverage
resolved as covered by tools/auditMisbinds.mjs — no parallel harness (ADR 0001).
Row → Blocked, owner human, ready-for-human (awaiting skrabe merge of #6).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
@dividedby dividedby marked this pull request as draft June 9, 2026 06:58
@dividedby

Copy link
Copy Markdown
Author

Marking this draft — it's an optional contribution and I'd rather check it maps onto how you want tests to look before it's a merge ask.

Two independent pieces, different value/cost:

  1. systemPromptCustomization.test.ts — plain unit tests for the exported customization helpers (buildRegexFromPieces escaping, extractUserCustomizations/buildHumanToRealMapping round-trip, whitespace pair). No version pinning, nothing to maintain on a bump. This is the part I'd actually stand behind.

  2. promptsGolden.test.ts (+ snapshot) — a digest lock over prompts-2.1.169.json. Honest tradeoff: it's pinned to one version, so it becomes your maintenance — on the next bump you'd bump PINNED_VERSION and regenerate the snapshot. And it surfaces two quirks in the current data (4 duplicate-id groups; 3 skill-code-review-* prompts whose identifier index has no identifierMap entry → reconstruct as UNKNOWN_<idx>) that you may already know about or have reasons for. Given you've got auditMisbinds + the capture/read-consistency guard, a golden snapshot may just not be the shape you want.

So: do you want test contributions like this at all, and if so in what form? Happy to drop the golden and keep only the unit helpers, rework it to track latest instead of a pinned version, or close this and leave testing to your own gates. Your call — no pressure to merge.

@skrabe

skrabe commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Thanks — and good instinct marking it draft. Where I land:

Taking systemPromptCustomization.test.ts. Those exported helpers (buildRegexFromPieces, extractUserCustomizations/buildHumanToRealMapping, reconstructContentFromPieces, the whitespace pair) carry the extract→map→apply pipeline and had no direct coverage — a silent regression there corrupts every customized prompt on a bump. Version-independent, zero maintenance. Exactly the shape I want.

Dropping the golden + snapshot. Three reasons, none about the quality of the work:

  • It duplicates a gate I already run. A silently dropped/renamed/reshaped prompt is already caught by the showtime no-regression check (named-count can't drop, has to stay ahead of Piebald) + auditMisbinds + the capture/read-consistency tests. A pinned vitest mirror is redundant.
  • Pinning to 2.1.169 means I regenerate the snapshot every bump — and since I regenerate it as part of the bump, its only real catch-window is accidental drift between bumps, which is narrow. Net it's recurring friction + stale-red risk for little marginal protection.
  • It captures the two quirks rather than root-causing them. I'd rather not green-light the 4× duplicate-id by snapshotting it — I actually watched it bite on an apply (the system-reminder-cross-session-peer-message-authority-warning override flip-flopped 2.1.167 → 2.1.169 / 2.1.169 → 2.1.167 in the sync step). That's a candidate extractor finding I want to fix, not lock in. Genuinely useful that your snapshot surfaced it — thanks for that.

So: drop promptsGolden.test.ts + the snapshot, keep systemPromptCustomization.test.ts, un-draft, and I'll merge. Appreciate the contribution.

Per skrabe's review on
skrabe#6 — the pinned golden
duplicates the showtime no-regression check plus auditMisbinds plus the
capture/read-consistency tests, and costs a regen every bump for a narrow
catch-window. Keep systemPromptCustomization.test.ts (version-independent
helper coverage).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dividedby dividedby marked this pull request as ready for review June 9, 2026 18:16
@dividedby

Copy link
Copy Markdown
Author

Done — dropped promptsGolden.test.ts + the snapshot (e835b4f), kept systemPromptCustomization.test.ts for the version-independent helper coverage. Suite green (340 passed), lint clean. PR is un-drafted and ready to merge.

Your reasoning on the golden is right — it duplicates gates you already run and pins you to a regen every bump for a narrow catch-window. The helper units are the part worth keeping.

On the flip-flop the snapshot surfaced: agreed it's an extractor finding to root-cause, not lock in — leaving it with you. For the record, what we saw was the system-reminder-cross-session-peer-message-authority-warning override oscillating 2.1.167 → 2.1.169 / 2.1.169 → 2.1.167 across the sync step, alongside the 4× duplicate id. Happy to help reproduce if useful. Thanks for the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants