release: live-scoring v3 + anc 0.4.0 + post-#91 promotion #115
Merged
First production cut of the live-scoring stack. Promotes everything on dev since PR #91 (the prior release): plan U5-U10 worker code (handler, sandbox DO, container, R2 cache, rate limits, telemetry, kill switch, homepage form, shareable result URLs, monitoring runbook), the docker sandbox image with anc v0.4.0 baked in, deploy split + routing-drift follow-ups (#92 CI token plumbing), the contributor surface (nav, footer Source row, intake template, README rewrite), the build SRP refactor, the SEO JSON-LD @graph fix, the cross-migration rollback rehearsal evidence and recipe correction, and the env.SCORE handler guard that converts a mid-rollback CF 1101 into a typed 503 sandbox_unavailable. Lockstep image bump: both top-level containers[0].image and env.staging.containers[0].image at :9aed5c3 (anc-cli v0.4.0, sha256:dae72c56afe2f332e8745c0517f1ed5d21993470de663409dfc9b3973cdfe4c1). The image cleared staging deploy on dev push 26384622721; soak was skipped per release-cut decision. Triple-diff verification clean (134 files changed; no guarded-path leaks; expected B-diff is the prod-pin bump only).
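The `env.SCORE` handler guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository: the `Env`, `ErrorMeta`, and `GuardResult` shapes and the `guardScoreBinding` name are hypothetical, and only the 503 status, the `sandbox_unavailable` code, and the `spec_version` / `checker_url` fields come from the PR text. A Worker would presumably wrap the returned object in a `Response`.

```typescript
// Hypothetical binding shape: SCORE is absent while a rollback is in flight.
interface Env {
  SCORE?: unknown;
}

// Assumed metadata fields; the PR names spec_version and checker_url only.
interface ErrorMeta {
  spec_version: string;
  checker_url: string;
}

interface GuardResult {
  status: number;
  body: { error: string; spec_version: string; checker_url: string };
}

// Returns null when the binding exists (the handler continues), or a typed
// 503 description when it is missing, so the request never reaches code
// that would dereference an undefined binding.
function guardScoreBinding(env: Env, meta: ErrorMeta): GuardResult | null {
  if (env.SCORE !== undefined) return null;
  return {
    status: 503,
    body: {
      error: "sandbox_unavailable",
      spec_version: meta.spec_version,
      checker_url: meta.checker_url,
    },
  };
}
```

Checking the binding before any use keeps the failure typed and client-renderable instead of surfacing as an uncaught exception.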
cd06021 to
bdd4ff9
Compare
Merged
3 tasks
brettdavies added a commit that referenced this pull request on May 25, 2026
…post-#115 follow-up) (#116)

## Summary

Two related changes in two commits.

**Commit 1, enable the production homepage form after fail-closed verification.** PR #115 deliberately omitted the production `TURNSTILE_SITEKEY` value from `wrangler.jsonc`, with the express purpose of forcing the live-scoring form to fail closed in production immediately after the v3 cut deployed: an empty `sitekey` causes `src/client/live-score.ts` to disable the form with a "Live scoring is available on staging only" notice rather than render a non-functional Turnstile widget. The point of that deliberate omission was to verify, on live anc.dev, that the fail-closed path actually works end-to-end before flipping the switch. It did: the form was disabled, the disabled-state message rendered, the surface degraded gracefully without a single 5xx.

With fail-closed verified in production, this commit now enables the form by wiring the real production `sitekey` into the top-level `vars` block. The matching `TURNSTILE_SECRET` was set on the production Worker via `wrangler secret put TURNSTILE_SECRET` separately (encrypted at rest in CF; never committed).

The `sitekey` is public-by-design: Turnstile embeds it in HTML at request time for the client-side widget to render. Anyone viewing the page source already sees it; committing it to `wrangler.jsonc` is intentional. The secret is what enforces ownership server-side at the `siteverify` API boundary, and it is not in the repo.

**Commit 2, trim wrangler.jsonc comments per `/code-comments` policy + stash rationale.** Eight comment blocks audited against the WHY-only / no-temporal / no-task-flow / no-local-doc-ref / no-instructional policy. Net 65 lines removed from `wrangler.jsonc`, 41 lines added to `RELEASES-RATIONALE.md` under a new `## Wrangler env inheritance traps` section that consolidates which keys inherit from top-level (`routes`, `triggers`, `vars`), which do not (`containers`, `durable_objects`, `migrations`, etc.), the REPLACE-not-merge semantics on `vars`, and the 2026-04-30 routing-drift incident as historical context. Each trimmed comment in `wrangler.jsonc` keeps the WHY that's unique to inline context and points at the consolidated rationale for the rest. Procedural runbook content that was duplicated between `wrangler.jsonc` and `RELEASES.md` § Sandbox image releases is dropped from the config in favor of the pointer.

## Changelog

### Added

- Enable the live-scoring homepage form on anc.dev by wiring the production `TURNSTILE_SITEKEY`. PR #115's intentional omission served its purpose: the fail-closed path was verified in production (form disabled with the staging-only notice, no 5xx, graceful degradation). With fail-closed proven, the real `sitekey` is now in place and the form is live.

### Changed

- Consolidate Wrangler env-inheritance rationale (`routes` / `triggers` / `vars` inheritance semantics, the 2026-04-30 routing-drift incident, container app naming quirk) into a new `RELEASES-RATIONALE.md` § Wrangler env inheritance traps. `wrangler.jsonc` comments now point at the consolidated rationale.

### Documentation

- Trim eight `wrangler.jsonc` comment blocks per the `/code-comments` policy: drop temporal phrasing (`first-ever`, `since 2026-04-30`, etc.), drop `docs/plans/` and personal-repo references, drop instructional voice (`Flip via:`, `Deploy with:`, `Removing this line will reintroduce the bug`), drop runbook content that duplicates `RELEASES.md` § Sandbox image releases.

## Type of Change

- [x] `feat`: New feature (non-breaking change which adds functionality)

PR #115 shipped working live-scoring infrastructure behind a deliberate fail-closed gate. Commit 1 flips the gate open after that fail-closed behavior was verified in production. Commit 2 is documentation hygiene that surfaced during a `/code-comments` audit of the config file we were already editing.

## Related Issues/Stories

- Story: Closes the deferred-promotion step from PR #115 (live-scoring v3 launch). PR #115's empty `TURNSTILE_SITEKEY` was a designed-in safety gate so the v3 cut would fail closed if anything about Turnstile wiring were misconfigured. Manual visit to anc.dev right after #115 deployed confirmed the fail-closed UX: form disabled, staging-only notice rendered, no error pages. Commit 1 enables the form now that fail-closed is proven. Documentation cleanup is opportunistic, triggered by being in the file already.
- Issue: n/a
- Architecture: n/a
- Related PRs: #115 (the deliberate fail-closed cut this enables), #114 (sandbox image bump to anc 0.4.0 that #115 promoted).

## Testing

- [x] Unit tests added/updated
- [x] All tests passing

**Test Summary:**

- `bun test`: 737 pass / 0 fail across 28 files.
- `bun x wrangler deploy --dry-run`: clean. Lists `env.TURNSTILE_SITEKEY ("ff0x4AAAAAADQFMBoVm56-OPuQ")` (the real production `sitekey`).
- `bun x wrangler deploy --dry-run --env staging`: clean. Lists `env.TURNSTILE_SITEKEY ("1x00000000000000000000AA")` (Cloudflare's always-pass test `sitekey`, unchanged). Confirms env.staging.vars REPLACE semantics correctly isolates staging from the new top-level value.
- Pre-push gate: pass (lint, build, test, both wrangler dry-runs, pack-README, banned-fonts, prose-check 0 blocking).
- `/code-comments` policy pattern scan against `wrangler.jsonc`: zero temporal / zero local-doc / zero instructional findings post-trim.
- `wrangler secret list` against the production Worker confirms `TURNSTILE_SECRET` is set (re-set from 1Password's `agentnative-site Cloudflare Turnstile / prod.secret_key` field via `wrangler secret put` before this PR opened).
- **Fail-closed verification (post-#115, pre-#116):** anc.dev homepage form rendered the disabled state with the "Live scoring is available on staging only" notice; no Turnstile widget rendered; no `/api/score` POST was dispatched; no 5xx in the request log. The deliberate omission worked exactly as designed.

**Post-merge verification plan** (after the production deploy on this PR's merge):

- Visit anc.dev. Confirm the homepage form is now enabled and the Turnstile widget renders (invisible-mode, so it may not be visually obvious; the form's submit button should be clickable rather than greyed out with the staging-only notice).
- Submit a real input through the homepage form. Confirm round-trip: Turnstile widget executes, POST to `/api/score` returns a scorecard, browser redirects to `/live-score/<binary>` with the inline scorecard.
- `curl https://anc.dev/api/score -H 'Content-Type: application/json' -d '{}'` still returns `400 unrecognized_input` (sanity: handler unchanged by this PR).

## Files Modified

**Modified:**

- `wrangler.jsonc`: enabled the top-level `vars` block and populated `TURNSTILE_SITEKEY` with the production `sitekey` from 1Password (`agentnative-site Cloudflare Turnstile / prod.site_key`). Trimmed eight comment blocks per the `/code-comments` policy (see `## Summary` commit 2 for the per-block summary). Net 65 lines removed.
- `RELEASES-RATIONALE.md`: new `## Wrangler env inheritance traps` section consolidates inheritance semantics, override patterns, and the historical context that the config-file comments previously held inline.

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.
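The fail-closed gate described in this commit message can be illustrated with a small pure function. This is a sketch under stated assumptions, not the logic in `src/client/live-score.ts`: the `FormState` type, the `formStateForSitekey` name, and the whitespace trimming are hypothetical; only the empty-`sitekey` behavior and the notice text come from the message above.

```typescript
// Hypothetical state type: the form is either enabled with a sitekey,
// or disabled with a user-facing notice.
type FormState =
  | { enabled: true; sitekey: string }
  | { enabled: false; notice: string };

// Notice text quoted from the commit message.
const STAGING_ONLY_NOTICE = "Live scoring is available on staging only";

// Fail closed: a missing or empty sitekey disables the form instead of
// rendering a Turnstile widget that cannot work. Trimming is an assumption.
function formStateForSitekey(sitekey: string | undefined): FormState {
  const trimmed = (sitekey === undefined ? "" : sitekey).trim();
  if (trimmed === "") {
    return { enabled: false, notice: STAGING_ONLY_NOTICE };
  }
  return { enabled: true, sitekey: trimmed };
}
```

Modeling the decision as data makes the disabled path easy to assert in a unit test without a DOM harness.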
Merged
3 tasks
brettdavies added a commit that referenced this pull request on May 25, 2026
…udit from release #116 (#117)

## Summary

Back-port of PR #116 (which shipped to production via main) onto dev so the next release branch carries the production Turnstile wiring and the consolidated `wrangler.jsonc` comment hygiene by default, rather than re-introducing the fail-closed gap or the duplicated runbook prose.

Two commits cherry-picked from the original `release/2026-05-25-prod-turnstile-wiring` branch (deleted on PR #116 merge), preserving the same diff that already shipped to production:

- **`feat(worker)` c08a0fa, mirror prod TURNSTILE_SITEKEY wiring from release #116.** Enables the top-level `vars` block in `wrangler.jsonc` and populates `TURNSTILE_SITEKEY` with the production `sitekey` from 1Password (`agentnative-site Cloudflare Turnstile / prod.site_key`). The matching `TURNSTILE_SECRET` lives in CF-encrypted wrangler secrets, not committed. env.staging.vars is unchanged: it carries Cloudflare's always-pass test `sitekey` (`1x...AA`), which under wrangler `vars` REPLACE semantics correctly isolates staging from the new top-level value. Without this back-port, the next release branch snapshot of dev would re-introduce the empty `TURNSTILE_SITEKEY` and re-create the fail-closed gap on production.
- **`docs(wrangler)` bac003d, trim comments per `/code-comments` policy + stash rationale.** Eight comment blocks audited against the WHY-only / no-temporal / no-task-flow / no-local-doc-ref / no-instructional policy. Net 65 lines removed from `wrangler.jsonc`, 41 lines added to `RELEASES-RATIONALE.md` under a new `## Wrangler env inheritance traps` section that consolidates which keys inherit from top-level (`routes`, `triggers`, `vars`), which do not, the REPLACE-not-merge semantics on `vars`, and the 2026-04-30 routing-drift incident as historical context. Each trimmed comment in `wrangler.jsonc` keeps the WHY that's unique to inline context and points at the consolidated rationale for the rest.

## Changelog

### Added

- Mirror the production `TURNSTILE_SITEKEY` wiring from release #116 onto dev so the next release-branch snapshot carries it. No behavior change on dev (env.staging keeps the test `sitekey` for CLI verification flows under `vars` REPLACE semantics).

### Changed

- Consolidate Wrangler env-inheritance rationale (`routes` / `triggers` / `vars` inheritance semantics, the 2026-04-30 routing-drift incident, container app naming quirk) into `RELEASES-RATIONALE.md` § Wrangler env inheritance traps. `wrangler.jsonc` comments now point at the consolidated rationale.

### Documentation

- Trim eight `wrangler.jsonc` comment blocks per the `/code-comments` policy: drop temporal phrasing (`first-ever`, `since 2026-04-30`, etc.), drop `docs/plans/` and personal-repo references, drop instructional voice (`Flip via:`, `Deploy with:`, `Removing this line will reintroduce the bug`), drop runbook content that duplicates `RELEASES.md` § Sandbox image releases.

## Type of Change

- [x] `feat`: New feature (non-breaking change which adds functionality)

The `feat` headlines because the next release-branch cut now ships the working homepage form on production by default. Documentation cleanup rides along.

## Related Issues/Stories

- Story: Back-port of PR #116 onto dev to prevent the next release-branch snapshot from re-introducing the empty-`sitekey` fail-closed state on production. PR #116 already shipped to anc.dev via squash-merge to main; this PR aligns dev with main on the relevant files.
- Issue: n/a
- Architecture: n/a
- Related PRs: #116 (the production release this back-ports), #115 (the live-scoring v3 launch that ran behind the deliberate fail-closed gate), #114 (sandbox image bump to anc 0.4.0).

## Testing

- [x] Unit tests added/updated
- [x] All tests passing

**Test Summary:**

- `bun test`: 737 pass / 0 fail across 28 files.
- `bun x wrangler deploy --dry-run`: clean. Lists `env.TURNSTILE_SITEKEY ("ff0x4AAAAAADQFMBoVm56-OPuQ")` (the real production `sitekey`).
- `bun x wrangler deploy --dry-run --env staging`: clean. Lists `env.TURNSTILE_SITEKEY ("1x00000000000000000000AA")` (Cloudflare's always-pass test `sitekey`, unchanged). Confirms env.staging.vars REPLACE semantics correctly isolates staging from the new top-level value.
- Pre-push gate: pass (lint, build, test, both wrangler dry-runs, pack-README, banned-fonts, prose-check).

**Post-merge verification plan** (after the staging deploy on this PR's merge):

- The staging Worker on `agentnative-site-staging.<subdomain>.workers.dev` continues to render its homepage form via the test `sitekey`; nothing on staging changes because env.staging.vars REPLACE wins.
- The next release-branch cut from this dev tip carries `TURNSTILE_SITEKEY: ff0x...` in `wrangler.jsonc` top-level by default. No further hot-fix needed to keep the production form enabled across future cuts.

## Files Modified

**Modified:**

- `wrangler.jsonc`: enabled the top-level `vars` block with the production `sitekey`; trimmed eight comment blocks per `/code-comments` policy. Net 65 lines removed.
- `RELEASES-RATIONALE.md`: new `## Wrangler env inheritance traps` section consolidates inheritance semantics, override patterns, and the historical context that the config-file comments previously held inline.

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.
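The REPLACE-not-merge behavior on `vars` that this back-port relies on can be modeled with a toy resolver. This is an illustrative sketch only: `WranglerLikeConfig`, `effectiveVars`, the `EXAMPLE_FLAG` key, and the placeholder production value are hypothetical, and the fallback to top-level `vars` when an environment defines none simply follows the inheritance list described in this PR rather than anything verified against Wrangler itself.

```typescript
type Vars = Record<string, string>;

// Hypothetical, heavily simplified view of a wrangler.jsonc file.
interface WranglerLikeConfig {
  vars?: Vars;
  env?: Record<string, { vars?: Vars }>;
}

// When the named environment defines its own `vars`, that block is used
// alone: top-level keys are NOT merged in. Otherwise fall back to the
// top-level block (assumption taken from the PR's description).
function effectiveVars(config: WranglerLikeConfig, envName?: string): Vars {
  const envBlock = envName !== undefined && config.env ? config.env[envName] : undefined;
  if (envBlock && envBlock.vars !== undefined) {
    return { ...envBlock.vars };
  }
  return { ...(config.vars || {}) };
}

// Example config: the production value is a placeholder, the staging value
// is the always-pass test sitekey quoted in the PR.
const exampleConfig: WranglerLikeConfig = {
  vars: { TURNSTILE_SITEKEY: "prod-sitekey-placeholder", EXAMPLE_FLAG: "on" },
  env: { staging: { vars: { TURNSTILE_SITEKEY: "1x00000000000000000000AA" } } },
};
```

Under this model `effectiveVars(exampleConfig, "staging")` contains only the staging `TURNSTILE_SITEKEY`; `EXAMPLE_FLAG` does not leak in, which is the isolation the dry-run output above is checking for.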
3 tasks
brettdavies added a commit that referenced this pull request on May 25, 2026
…Turnstile 400020) (#118)

## Summary

Fast follow to PR #116. The production `TURNSTILE_SITEKEY` value pinned in `wrangler.jsonc` was `ff0x4AAAAAADQFMBoVm56-OPuQ` (26 chars), which the Cloudflare Turnstile widget rejects with error 400020 (Invalid `sitekey`) on every request. Cloudflare API confirmed the dashboard's actual `sitekey` for this account is `0x4AAAAAADQFMBoVm56-OPuQ` (24 chars, starting with `0x` per Turnstile's `sitekey` convention). The 1Password item `agentnative-site Cloudflare Turnstile / prod.site_key` had a stray `ff` prefix from a paste error when the credential was first saved; PR #116 dutifully read that wrong value into `wrangler.jsonc`. The 1Password field has been corrected at source and this PR ships the corrected pin to production.

User-visible state after PR #116 deployed:

- Homepage form was enabled (no longer in the fail-closed "Live scoring is available on staging only" state).
- Turnstile widget tried to challenge `https://challenges.cloudflare.com/...<key-in-path>...` with the pinned `ff0x4...` value, got HTTP 400 back, console showed Turnstile error 400020.
- Browser console additionally showed "Call to execute() on a widget that is already executing" because the client retries the challenge on its own (separate client-side bug; will be filed as its own PR).
- No `/api/score` POST ever reached the Worker because the widget never produced a token; no rate-limit or kill-switch impact; no telemetry events written.

env.staging is unchanged: it carries Cloudflare's always-pass test `sitekey` (`1x00000000000000000000AA`), independent of the 1Password value.

## Changelog

### Fixed

- Correct the production `TURNSTILE_SITEKEY` value in `wrangler.jsonc` from `ff0x4AAAAAADQFMBoVm56-OPuQ` (paste-error value carried into PR #116 from a stale 1Password entry) to `0x4AAAAAADQFMBoVm56-OPuQ` (the actual dashboard `sitekey` for this account). Cloudflare Turnstile error 400020 on the anc.dev homepage form clears with this deploy.

## Type of Change

- [x] `fix`: Bug fix (non-breaking change which fixes an issue)

## Related Issues/Stories

- Story: PR #116 enabled the homepage form on production after fail-closed verification, but the pinned `TURNSTILE_SITEKEY` value had a stray `ff` prefix from a 1Password paste error. This corrects the value at source (1Password) and ships the corrected pin.
- Issue: n/a
- Architecture: n/a
- Related PRs: #116 (the enable-the-form release whose 1Password value was wrong), #115 (the live-scoring v3 launch this restores to working order).

## Testing

- [x] Unit tests added/updated
- [x] All tests passing

**Test Summary:**

- `bun test`: 737 pass / 0 fail across 28 files.
- `bun x wrangler deploy --dry-run`: clean. Lists `env.TURNSTILE_SITEKEY ("0x4AAAAAADQFMBoVm56-OPuQ")` (the corrected production `sitekey`).
- `bun x wrangler deploy --dry-run --env staging`: clean. Lists `env.TURNSTILE_SITEKEY ("1x00000000000000000000AA")` (Cloudflare's always-pass test `sitekey`, unchanged).
- Pre-push gate: pass (lint, build, test, both wrangler dry-runs, pack-README, banned-fonts, prose-check).
- Cloudflare API: `GET /accounts/<id>/challenges/widgets` failed with auth-scope error on the existing `CF_API_TOKEN`, so the dashboard `sitekey` was retrieved manually and re-staged into 1Password via the no-echo `stage_secret.sh` pipeline. 1Password's `prod.site_key` field is now 24 chars matching the dashboard value.

**Post-merge verification plan** (after the production deploy on this PR's merge):

- View source on anc.dev and confirm the meta tag `turnstile-sitekey` carries `content="0x4AAAAAADQFMBoVm56-OPuQ"` (with `0x4` prefix, no `ff`).
- Open anc.dev in a browser. Submit a real input (e.g., `ripgrep`). Confirm the Turnstile challenge succeeds (invisible-mode, no widget popup), POST to `/api/score` returns a scorecard, browser redirects to `/live-score/<binary>` with the inline scorecard.
- Confirm browser console no longer shows Turnstile error 400020.
- Separately track: the "Call to execute() on a widget that is already executing" warning is a client-side bug in `src/client/live-score.ts:acquireTurnstileToken` (always-render + always-execute pattern, needs reset()-before-execute) and the `static.cloudflareinsights.com/beacon.min.js` CSP violation is the CF Web Analytics auto-injected beacon hitting a CSP that doesn't list `static.cloudflareinsights.com`. Both are out of scope here.

## Files Modified

**Modified:**

- `wrangler.jsonc`: top-level `vars.TURNSTILE_SITEKEY` changed from `ff0x4AAAAAADQFMBoVm56-OPuQ` to `0x4AAAAAADQFMBoVm56-OPuQ` (drop stray `ff` prefix).

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.
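A paste error like the stray `ff` prefix could in principle be caught before deploy with a shape check on the pinned value. The sketch below is hypothetical (no such check is described in this PR): `sitekeyProblem` is an invented name, and the only rules it encodes are the two stated above, namely that real sitekeys start with `0x` and that staging uses the always-pass test sitekey `1x00000000000000000000AA`.

```typescript
// Always-pass test sitekey quoted in the PR.
const TEST_SITEKEY = "1x00000000000000000000AA";

// Returns a human-readable problem, or null when the value looks plausible.
// This checks shape only; it cannot prove the key belongs to the account.
function sitekeyProblem(sitekey: string): string | null {
  if (sitekey === TEST_SITEKEY) return null;
  if (sitekey.slice(0, 2) !== "0x") {
    return 'expected a "0x" prefix, got "' + sitekey.slice(0, 2) + '"';
  }
  return null;
}
```

Run against the two values from this PR, the check flags `ff0x4AAAAAADQFMBoVm56-OPuQ` and accepts `0x4AAAAAADQFMBoVm56-OPuQ`.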
3 tasks
brettdavies added a commit that referenced this pull request on May 25, 2026
…119)

## Summary

`acquireTurnstileToken` called `api.render()` on every acquire, so the second form submit produced "Call to execute() on a widget that is already executing" in the console plus Turnstile error 400020 from `challenges.cloudflare.com`. Render exactly once per page session, then `reset()` + `execute()` on the existing widget id for retries. Surfaced after #118 corrected the production `sitekey` and the form started producing real challenges.

## Changelog

### Fixed

- `/api/score` form on anc.dev succeeds on retry submits. Pre-fix, only the first submit per page could produce a Turnstile token; subsequent submits silently failed with Turnstile error 400020.

## Type of Change

- [x] `fix`: Bug fix (non-breaking change which fixes an issue)

## Related Issues/Stories

- Story: Surfaced during browser-side QA after PR #118 corrected the production `TURNSTILE_SITEKEY`.
- Issue: n/a
- Architecture: n/a
- Related PRs: #118, #116, #115.

## Testing

- [x] Unit tests added/updated
- [x] All tests passing

**Test Summary:**

- `bun test`: 737 pass / 0 fail. No client-side unit test added: no DOM harness in Bun test today; regression test pinned to follow-up `feat(test-infra)` PR (plan in flight).
- Manual: on staging after merge, submit twice in the same page session; both should round-trip with zero Turnstile console warnings.

## Files Modified

**Modified:**

- `src/client/live-score.ts`: render Turnstile once; reset+execute the existing widget id on subsequent acquires. Module-scope `turnstileWidget` + `pendingTurnstile` slot with a `settleTurnstile` helper rotating the resolver per acquire.

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.
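The render-once, then reset-and-execute pattern can be sketched as below. This is not the code in `src/client/live-score.ts`: `createTokenAcquirer` and the callback-style `acquire` are hypothetical, the `TurnstileApi` interface is a trimmed-down slice of the Turnstile client API (`render`, `reset`, `execute`), and the `execution: "execute"` render option is included on the assumption that the challenge should start only on an explicit `execute()` call.

```typescript
// Minimal slice of the Turnstile client API used by this sketch.
interface TurnstileApi {
  render(
    container: string,
    params: { sitekey: string; execution: "execute"; callback: (token: string) => void },
  ): string;
  reset(widgetId: string): void;
  execute(widgetId: string): void;
}

function createTokenAcquirer(api: TurnstileApi, container: string, sitekey: string) {
  let widgetId: string | null = null; // set by the first render only
  let pending: ((token: string) => void) | null = null; // receiver for the in-flight acquire

  return function acquire(onToken: (token: string) => void): void {
    pending = onToken; // rotate the receiver for this acquire
    let id = widgetId;
    if (id === null) {
      // First acquire in this page session: render exactly once.
      id = api.render(container, {
        sitekey: sitekey,
        execution: "execute",
        callback: function (token) {
          const settle = pending;
          pending = null;
          if (settle) settle(token);
        },
      });
      widgetId = id;
    } else {
      // Later acquires: reuse the widget, clearing its previous state first.
      api.reset(id);
    }
    api.execute(id);
  };
}
```

Because the API is injected, the pattern can be exercised with a fake that counts `render`, `reset`, and `execute` calls, which sidesteps the missing DOM harness mentioned in the test summary.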
Merged
3 tasks
brettdavies added a commit that referenced this pull request on May 25, 2026
…ide widget teardown (#120)

## Summary

Three fixes that surfaced sequentially while validating PR #119's reset+execute fix on staging. Validated end-to-end at staging Worker version `c6ab5306-b238-4e07-b41b-472858261c15`.

**Commit 1 (`3ff7b7a`), drop invalid `size: 'invisible'` + add `execution: 'execute'`.** `acquireTurnstileToken` passed `size: 'invisible'` to `api.render()`. Per CF docs, `size` accepts `compact | flexible | normal` only; `invisible` throws `Uncaught TurnstileError` and puts the widget in a stuck "executing" state that masks the #119 reset+execute fix. Invisible behavior is `sitekey`-mode (set in CF dashboard), not a render-time argument. Drop the invalid value; the off-screen container CSS keeps the widget visually hidden. Add `execution: 'execute'` so the challenge defers to our explicit `api.execute()` instead of starting on render.

**Commit 2 (`1ff0b01`), allow Lato in CSP + tear down widget on `pagehide`.** Even with the `sitekey` configured as `Invisible` mode, Turnstile's bootstrap injects `<link rel=stylesheet href="https://fonts.googleapis.com/css?family=Lato...">` into the host document (defensive UI prep). Our CSP blocks it. Allowlist `https://fonts.googleapis.com` on `style-src` and `https://fonts.gstatic.com` on `font-src` in both `CSP_HTML` (`src/worker/headers.ts`) and `LIVE_SCORE_CSP` (`src/worker/score/summary-render.ts`). Separately, on `pagehide`, call `api.remove(widgetId)` and clear module-scope state so a `bfcache` restore can't re-bootstrap a half-dead widget and re-inject the Lato stylesheet.

## Changelog

### Fixed

- `/api/score` homepage form on anc.dev now executes Turnstile cleanly: zero `Uncaught TurnstileError`, zero "already executing" warnings, zero CSP violations on initial load OR `bfcache` restore (back-button to homepage from a result page).

### Changed

- CSP allowlist for `https://fonts.googleapis.com` on `style-src` and `https://fonts.gstatic.com` on `font-src` (both `CSP_HTML` and `LIVE_SCORE_CSP`). Required for Turnstile's defensive Lato bootstrap even on Invisible-mode `sitekeys`.
- TurnstileApi type tightened to match docs: `size` is now `'compact' | 'flexible' | 'normal'`; new optional `execution` field.

## Type of Change

- [x] `fix`: Bug fix (non-breaking change which fixes an issue)

## Related Issues/Stories

- Story: Surfaced sequentially during browser-side validation of PR #119. Each fix unmasked the next: #119 fixed the render-twice pattern, exposing the `size:'invisible'` invalid value; fixing that unmasked the Lato CSP gap and the `bfcache` stale-widget. All three now clean.
- Issue: n/a
- Architecture: n/a
- Related PRs: #119 (reset+execute, now actually reachable), #118, #116, #115.

## Testing

- [x] Unit tests added/updated
- [x] All tests passing

**Test Summary:**

- `bun test`: 737 pass / 0 fail.
- `bun run lint`: clean.
- `bun run build`: clean. `dist/js/live-score.js` bundle contains `execution:'execute'` and the `pagehide` handler; no `size:'invisible'`.
- Manual on staging at Worker version `c6ab5306-b238-4e07-b41b-472858261c15` (deployed from this branch via `bun x wrangler deploy --env staging`):
  - First submit (`ripgrep`) round-trips, redirects to `/score/ripgrep`. Zero console warnings.
  - Back-button to homepage. Zero CSP violations on `bfcache` restore.
  - Second submit succeeds the same as the first.

## Files Modified

**Modified:**

- `src/client/live-score.ts`: drop `size: 'invisible'` from `api.render()`; add `execution: 'execute'`; tighten `TurnstileApi` type; add `pagehide` listener that removes the widget + clears module-scope state.
- `src/worker/headers.ts`: extend `CSP_HTML` `style-src` and `font-src` to allow Google Fonts origins (Turnstile bootstrap requirement).
- `src/worker/score/summary-render.ts`: same CSP extension on `LIVE_SCORE_CSP` for `/live-score/<binary>` pages.

**Created:**

- None.

**Renamed:**

- None.

**Deleted:**

- None.
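The CSP change in commit 2 amounts to appending sources to two directives. A rough sketch of that operation on a policy string follows; `addCspSources` is a hypothetical helper, and the base policy used in the example is invented, since the real `CSP_HTML` and `LIVE_SCORE_CSP` values are not shown in this PR.

```typescript
// Adds `sources` to `directive` in a CSP string, creating the directive if
// it is absent. Note: creating a directive that was previously absent stops
// it from falling back to default-src, so list every source it needs.
function addCspSources(policy: string, directive: string, sources: string[]): string {
  const parts = policy
    .split(";")
    .map(function (p) { return p.trim(); })
    .filter(function (p) { return p !== ""; });

  let found = false;
  const updated = parts.map(function (part) {
    const tokens = part.split(/\s+/);
    if (tokens[0] !== directive) return part;
    found = true;
    for (const src of sources) {
      if (tokens.indexOf(src) === -1) tokens.push(src); // skip duplicates
    }
    return tokens.join(" ");
  });

  if (!found) updated.push([directive].concat(sources).join(" "));
  return updated.join("; ");
}

// Invented base policy for illustration only.
const basePolicy = "default-src 'self'; style-src 'self'; font-src 'self'";
const withFonts = addCspSources(
  addCspSources(basePolicy, "style-src", ["https://fonts.googleapis.com"]),
  "font-src",
  ["https://fonts.gstatic.com"],
);
```

Applying the helper twice with the same arguments leaves the policy unchanged, so it is safe to run over both CSP constants without tracking which one was already extended.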
16 tasks
brettdavies added a commit that referenced this pull request on May 26, 2026
…rl without hint (#121)

## Summary

Fixes a production-affecting bug where `/api/score` returned no `share_url` for any github-url paste without a curated discovery hint (e.g., `https://github.com/sharkdp/hexyl`, `https://github.com/o2sh/onefetch`). Users got a one-shot inline scorecard with no shareable URL, and `/score/live/<binary>` 404'd even though the DO had written the cache entry to R2. The handler now derives `share_url` from the discovered `spec.binary` on both the post-discovery cache_post tier and the live-success branch, keyed identically to the DO's cache write.

Reproduced on staging Worker `c6ab5306-b238-4e07-b41b-472858261c15` (PR #120 branch), fix verified on staging Worker deployed from this branch via workflow_dispatch run [26426837021](https://github.com/brettdavies/agentnative-site/actions/runs/26426837021).

## Root cause

The 2026-05-20 discovery-move (PR #100, U8) lifted binary resolution from the DO up to the Worker. Pre-move, the DO owned both discovery AND the scorecard run, so `share_url` could be derived at handler time only when the binary was knowable from the input alone (install-command spec or a hinted github-url). Discovery moving up to the Worker created the resolved `spec.binary` one tier earlier in the pipeline, but the share-URL derivation stayed pinned to the pre-discovery shape and never picked up the new value. The rest of the discovery-move audit surfaced no other stranded derivations.

## Changelog

### Fixed

- `/api/score` now returns `share_url: "/score/live/<binary>"` for github-url pastes without a curated hint, after the live discovery resolves the binary. Previously these requests returned a scorecard inline but no shareable URL, leaving the R2 cache entry unreachable.
- `share_url` now omits unshareable binaries (uppercase, underscore, period, or leading hyphen in the discovered binary). Previously these would have minted URLs the `/score/live/<binary>` route refuses to serve.

## Type of Change

- [x] `fix`: Bug fix (non-breaking change which fixes an issue)
- [x] `test`: Adding or updating tests
- [x] `refactor`: Code refactoring (no functional changes)

## Related Issues/Stories

- Story: n/a
- Issue: n/a
- Architecture: n/a
- Related PRs: #115 (live-scoring v3 launch where the gap shipped), #120 (Turnstile fix on the Worker version where the bug was reproed on staging)

## Testing

- [x] Unit tests added/updated
- [x] Integration tests added/updated
- [x] Manual testing completed
- [x] All tests passing

**Test Summary:**

- New file `tests/score-handler-share-url-post-discovery.test.ts`: 9 tests written tests-first (4 happy-path / drift-safety for github-url without hint, 4 red-team slug-shape tests, 1 branch-scoped no-op). All 4 happy-path tests failed before the fix landed; all 9 pass after.
- Extended `tests/score-registry-lookup.test.ts`: 6 unit tests for the new `deriveShareBinaryFromSpec()` helper covering all InstallSpec variants, plus a regex-source equality invariant pinning that `SHARE_URL_BINARY_RE` and the route's `BINARY_SLUG_RE` stay byte-identical.
- Full suite: 755/755 pass, 0 fail.
- Staging dogfood: deployed via workflow_dispatch, user confirmed the share-URL UX works end-to-end on the two repro inputs (`sharkdp/hexyl`, `o2sh/onefetch`).

## Files Modified

**Modified:**

- `src/worker/score/registry-lookup.ts`: added `SHARE_URL_BINARY_RE` constant, `deriveShareBinaryFromSpec(spec)` post-discovery helper, and a `safeShareBinary()` slug-shape gate. Extended `deriveShareBinary` to gate through `safeShareBinary` so pre-discovery callers also refuse unshareable binaries.
- `src/worker/score/handler.ts`: post-discovery cache_post tier (step 6.5) and live-success branch now mint `share_url` via the new `shareUrlForSpec(spec)`, keyed to `spec.binary` (the same value `do.ts:writeCacheBestEffort` writes the R2 cache under). Pre-discovery cache_pre tier still uses `shareUrlForInput` because the binary is by definition derivable from input there.
- `src/worker/score/summary-render.ts`: `BINARY_SLUG_RE` re-pointed at `SHARE_URL_BINARY_RE` so the route slug regex and the handler's mint-gate regex are one source of truth. The slug regex literal stays in `registry-lookup.ts`; the equality is asserted by a unit test.
- `tests/score-registry-lookup.test.ts`: six new unit tests for `deriveShareBinaryFromSpec` plus one regex-source invariant test.

**Created:**

- `tests/score-handler-share-url-post-discovery.test.ts`: 9 tests covering the bug plus red-team slug-shape gates.

**Renamed:**

- None.

**Deleted:**

- None.

## Key Features

- Eliminates the silent-fail UX for any github-url paste of a CLI not in the curated 96 hint set. The long tail of CLIs (most of them, by definition) now gets the same share-URL surface as curated tools.
- Slug-shape gate at mint time prevents a future regression where a discovered binary with characters the route rejects (`MyTool`, `my_tool`, `tool.js`) ships a URL that 404s downstream.
- Regex-source equality test makes the route ↔ handler invariant unfakeable at refactor time.

## Benefits

- Restores the v3 live-scoring product surface to the experience the launch documented: paste a github URL, get a shareable result page.
- No new dependencies, no new bindings, no migration. Same wrangler config as `dev`; safe to land independently of any infra work.

## Breaking Changes

- [x] No breaking changes
- [ ] Breaking changes described below:

## Deployment Notes

- [x] No special deployment steps required

Standard `dev`-squash-merge then production cherry-pick when the next release window opens. No env vars added or removed; no migration entries added (DO migrations stay untouched).

## Checklist

- [x] Code follows project conventions and style guidelines
- [x] Commit messages follow Conventional Commits
- [x] Self-review of code completed
- [x] Tests added/updated and passing
- [x] No new warnings or errors introduced
- [x] Changes are backward compatible

## Additional Context

Three commits in tests-first order: `6be3866` (failing tests), `6560587` (fix to green), `089a264` (refactor to collapse the duplicate slug regex).
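The slug-shape gate and post-discovery share-URL minting can be sketched as follows. The names `SHARE_URL_BINARY_RE`, `safeShareBinary`, and `shareUrlForSpec` appear in the commit message, but their bodies here are guesses: the regex literal is an assumption inferred only from the stated rejections (uppercase, underscore, period, leading hyphen), and `InstallSpecLike` is a hypothetical stand-in for the real spec type.

```typescript
// Assumed pattern: lowercase letters, digits, and hyphens, with no leading
// hyphen. The real literal in registry-lookup.ts is not shown in this PR.
const SHARE_URL_BINARY_RE = /^[a-z0-9][a-z0-9-]*$/;

// Hypothetical stand-in for the discovered install spec.
interface InstallSpecLike {
  binary?: string;
}

// Gate: only binaries the result route would accept are shareable.
function safeShareBinary(binary: string | undefined): string | undefined {
  if (binary === undefined) return undefined;
  return SHARE_URL_BINARY_RE.test(binary) ? binary : undefined;
}

// Mint a share URL from the discovered spec, or omit it when the binary
// would produce a URL the route refuses to serve.
function shareUrlForSpec(spec: InstallSpecLike): string | undefined {
  const binary = safeShareBinary(spec.binary);
  return binary === undefined ? undefined : "/score/live/" + binary;
}
```

Under this assumed pattern, `shareUrlForSpec({ binary: "hexyl" })` yields `/score/live/hexyl`, while `MyTool`, `my_tool`, `tool.js`, and a leading-hyphen name all yield `undefined`, so no URL is minted that would 404 downstream.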
Summary
First production cut of the live-scoring stack. Promotes every dev change since PR #91 (2026-05-15) and a lockstep sandbox image bump to anc-cli v0.4.0.
The headline is `/api/score` on anc.dev. Users (and agents) can paste an install command or a GitHub URL into the homepage form, and the Worker resolves it through the registry-fast-path, R2 cache, or a live sandbox run inside a Cloudflare Durable Object + Container. Sandbox runs invoke `anc check` (v0.4.0, baked into the image at build time) against a fresh install of the user's tool and return a typed scorecard with the response triad (`spec_version`, `site_spec_version`, `anc_version`, `checker_url`). Shareable result URLs at `/live-score/<binary>` serve from R2 with a 7-day TTL.

Ride-alongs:

- `@graph` now emits an Organization + Person author on every page.
- `build.mjs` SRP-split into numbered pipeline stages; CI smoke for `/api/score` extracted into `scripts/smoke-api-score.sh`.
- Typed 503 `sandbox_unavailable` instead of Cloudflare error 1101 when the `SCORE` DO binding is missing (mid-rollback safety; surfaced by the rehearsal).
- `RELEASES.md` includes the cross-migration rollback rehearsal evidence row and a recipe correction (mandatory `wrangler containers delete <id>` step between v2-drop-sandbox and v3-restore-sandbox).

Image lockstep: top-level `containers[0].image` and `env.staging.containers[0].image` both pinned at `:9aed5c3` (anc-cli v0.4.0, digest `sha256:dae72c56afe2f332e8745c0517f1ed5d21993470de663409dfc9b3973cdfe4c1`). The image cleared the staging deploy on dev push run `26384622721`; soak was skipped per the release-cut decision documented in the PR thread that produced #114.

Changelog
Added
- `/api/score`: paste an install command or GitHub URL, get a typed scorecard back. Homepage form + shareable `/live-score/<binary>` result URLs (R2-backed, 7-day TTL).
- `scoring_disabled` kill switch in KV.
- `brettdavies/agentnative-*` repos, intake template at `.github/ISSUE_TEMPLATE/`, rewritten README.
- `docs/runbooks/live-scoring-analytics.md`.

Changed

- `wrangler.jsonc` `containers[0].image` (production pin) advances from `:30f61f1` to `:9aed5c3` in lockstep with `env.staging.containers[0].image`. New image carries anc-cli v0.4.0 (was v0.3.1) plus the PR #100 Dockerfile pip-version-check suppression baked in (previously runtime-bridged).
- JSON-LD `@graph` with Organization + Person author (was bare `TechArticle`).
- `src/build/`; `build.mjs` becomes the orchestrator.
- `staging` env block in `wrangler.jsonc` carries an explicit `routes: []` and `triggers.crons: []` override and the full live-scoring binding surface (Sandbox DO, Container, R2, SCORE_KV, Analytics Engine, two rate-limit namespace IDs).

Fixed

- `/api/score` no longer surfaces Cloudflare error 1101 when the `SCORE` DurableObject binding is missing (mid-rollback Worker state). Returns a typed 503 `sandbox_unavailable` with `spec_version` and `checker_url` so clients can render a useful error.

Documentation

- `RELEASES.md` cross-migration rollback rehearsal section now documents the mandatory `wrangler containers delete <id>` step between v2-drop-sandbox and v3-restore-sandbox (surfaced live during the 2026-05-24 rehearsal).

Type of Change
- `feat`: New feature (non-breaking change which adds functionality)

This release is multi-typed (feat headline plus several fix and docs ride-alongs) but `feat` headlines because live scoring is the durable new product surface anc.dev ships with this cut.

Related Issues/Stories

- `RELEASES.md` § Cross-migration rollback rehearsal evidence row dated 2026-05-24) and the lockstep image bump via PR #114.
- `docs/solutions/tooling-decisions/cloudflare-sandbox-python-3.12-base-2026-05-19.md` (dev-only) for the python:3.12 base + version-pin matrix the sandbox image follows.
- `/api/score` U5)
- fix(worker): add fetch() to Sandbox DO stub #94 (DO stub fetch)
- feat(worker): U6 Sandbox DO install + score with two-phase egress #95 (Sandbox DO + two-phase egress U6)
- feat(worker): U7 R2 cache + unified scorecard lookup #96 (R2 cache + unified scorecard lookup U7)
- fix(docs): correct wrangler r2 bucket lifecycle add syntax #97 (R2 lifecycle docs)
- test(worker): U7 red-team gaps + staging Turnstile-bypass docs #98 (U7 red-team tests)
- fix(sandbox): python:3.12 base + sdist allowlist + supply-chain delay + standard-2 #99 (python:3.12 + sdist allowlist)
- feat(score): U8 homepage live-scoring form + shareable result URLs + discovery hardening #100 (homepage form + share URLs U8)
- test(score): U9 cross-cutting tests + CI smoke + v3 release procedure #101 (U9 tests + CI smoke)
- refactor(ci): extract /api/score smoke into scripts/smoke-api-score.sh #102 (extract smoke script)
- docs(runbook): add live-scoring monitoring runbook #103 (monitoring runbook)
- feat(score): U10 Analytics Engine telemetry + cost guardrails + analytics runbook #104 (analytics + cost guardrails U10)
- feat(contrib): contributor docs + repo-wide prose pass + intake templates #105 (contrib docs + prose pass)
- feat(nav): expose Skill and Contribute, add footer Source row #106 (nav + footer source)
- refactor(build): SRP-split build.mjs + numbered pipeline stages + sitemap and prose-check fixes #107 (build SRP split)
- docs(readme): rewrite to reflect current dev surface #108 (README rewrite)
- chore: matrix sync + docker/score docs refresh #109 (matrix sync)
- feat(docker/score): inject mode + per-tool run-time updates #110 (docker/score mode injection)
- fix(seo): emit Organization + Person author in JSON-LD @graph #111 (JSON-LD @graph)
- docs(releases): cross-migration rollback rehearsal evidence + recipe fix #112 (rehearsal evidence + recipe fix)
- fix(worker): guard /api/score against missing env.SCORE binding #113 (env.SCORE handler guard)
- feat(sandbox): bump anc to v0.4.0 + rebuild image :9aed5c3 for soak #114 (anc v0.4.0 image rebuild)

Testing
Test Summary:
- `bun test`: 737 pass / 0 fail across 28 files on the release branch.
- `bun x wrangler deploy --dry-run` against both environments: clean. Lists `agentnative-site-sandbox` at `:9aed5c3` (production) and `agentnative-site-staging-sandbox-staging` at `:9aed5c3` (staging).
- `RELEASES.md`.
- `26384622721` (post PR #114 merge): green. Confirmed `:9aed5c3` deployed and `/api/score` responds with `anc_version: 0.4.0` (cold-start `xplr` request returned typed 504 `timeout` while container provisioned; warm cache hit immediately after returned full scorecard).
- `RELEASES.md`: A=134 files clean, B=`wrangler.jsonc` only (the prod-pin bump), C=mirror of A, guarded-paths leak=clean.

Post-merge verification plan (after the production deploy on this PR's merge):

- `agentnative-site` at `:9aed5c3`.
- `curl https://anc.dev/api/score -H 'Content-Type: application/json' -d '{"input":"ripgrep","turnstile_token":"<real>"}'` returns 200 with the response triad (`spec_version: 0.4.0`, `site_spec_version: 0.4.0`, `anc_version: 0.4.0`, `checker_url`).
- `sitekey`, not the staging test `sitekey`) and successfully POSTs.
- `bun x wrangler containers list` shows `agentnative-site-sandbox` advanced from container app `a0329fb0-...` (image `:30f61f1`, 6 instances) to a new container app instance at `:9aed5c3`. Existing prod instances may take a few minutes to roll over.
- `v1` (which created the `Sandbox` class) applies to production for the first time; the migration is irreversible. Rollback path documented in `RELEASES.md` § Cross-migration rollback rehearsal, including the `wrangler containers delete <id>` step that the rehearsal surfaced.
- `RELEASES.md` § Post-deploy smoke scope, the smoke step is staging-only). Manual verification within ~5 minutes of merge is the gate.

Files Modified
Modified:
- `src/worker/`, `src/build/`, `content/`, `docker/sandbox/`, `docker/score/`, `tests/`, `scripts/`, `styles/`, `.github/`, `RELEASES.md`, `README.md`, `wrangler.jsonc`. Full list in the diff.
- `wrangler.jsonc`: lockstep image pin bump (top-level + env.staging both `:30f61f1` → `:9aed5c3`); full live-scoring binding surface mirrored under env.staging; migration history aligned with applied state (v1 top-level; v1 + v2-drop-sandbox + v3-restore-sandbox under env.staging per the rehearsal).
- `RELEASES.md`: rehearsal recipe correction + evidence row.

Created:

- `src/worker/score/handler.ts`, `src/worker/score/do.ts`, `src/worker/score/cache.ts`, `src/worker/score/discover-binary.ts`, `src/worker/score/parse-install.ts`, `src/worker/score/registry-lookup.ts`, `src/worker/score/resolve-spec.ts`, `src/worker/score/response-shape.ts`, `src/worker/score/sandbox-exec.ts`, `src/worker/score/session.ts`, `src/worker/score/sdist-allowlist.ts`, `src/worker/score/telemetry.ts`, `src/worker/score/turnstile.ts`, `src/worker/score/kill-switch.ts`, `src/worker/score/github-accessibility.ts`, plus their test files. Full list in the diff.
- `docker/sandbox/Dockerfile` (sandbox image source), `docker/sandbox/README.md`.
- `docs/runbooks/live-scoring-analytics.md`.
- `scripts/smoke-api-score.sh`, `scripts/staging-cache-smoke.sh`.

Renamed:
Deleted:
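
For readers who want the Summary's lookup order and the Fixed entry in one place, here is a minimal TypeScript sketch. It is not the code in `src/worker/score/handler.ts`: the names `resolveScore`, `ScoreSources`, and `Scorecard`, the synchronous lookups, the placeholder `CHECKER_URL`, and the choice to check for the sandbox only after both lookups miss are all assumptions made for illustration. Only the ordering (registry fast path, then R2 cache, then live sandbox) and the typed 503 `sandbox_unavailable` carrying `spec_version` and `checker_url` come from the description above.

```typescript
// Hypothetical shapes; the real scorecard and handler types are not shown in this PR description.
type Scorecard = { binary: string; anc_version: string };

interface ScoreSources {
  registry: (input: string) => Scorecard | null; // registry fast path
  cache: (input: string) => Scorecard | null;    // R2 cache lookup
  // Left undefined to model a Worker whose SCORE Durable Object binding is missing.
  sandbox?: (input: string) => Scorecard;
}

type ScoreResult =
  | { status: 200; source: "registry" | "cache" | "sandbox"; scorecard: Scorecard }
  | { status: 503; error: "sandbox_unavailable"; spec_version: string; checker_url: string };

const SPEC_VERSION = "0.4.0";
const CHECKER_URL = "https://example.invalid/checker"; // placeholder, not the real URL

function resolveScore(input: string, sources: ScoreSources): ScoreResult {
  // 1. Cheapest path first: a tool already present in the registry.
  const fromRegistry = sources.registry(input);
  if (fromRegistry) return { status: 200, source: "registry", scorecard: fromRegistry };

  // 2. A previously computed scorecard held in the cache.
  const fromCache = sources.cache(input);
  if (fromCache) return { status: 200, source: "cache", scorecard: fromCache };

  // 3. Guard: with no sandbox available, return a typed error instead of throwing.
  if (!sources.sandbox) {
    return {
      status: 503,
      error: "sandbox_unavailable",
      spec_version: SPEC_VERSION,
      checker_url: CHECKER_URL,
    };
  }

  // 4. Fall through to a live sandbox run.
  return { status: 200, source: "sandbox", scorecard: sources.sandbox(input) };
}
```

In this sketch a registry or cache hit still succeeds while the sandbox is unavailable. Whether the real guard runs before or after those lookups is not stated in the description.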