test(frontend): move non-browser e2e coverage from Playwright to Vitest by mariusvniekerk · Pull Request #500 · kenn-io/middleman

mariusvniekerk · 2026-06-13T01:46:39Z

The mock-API Playwright lane paid full-browser cost — two browser projects plus a Vite dev server per run — for behavior that needs no real rendering: keyboard dispatch, focus, store-driven DOM, fetch request shapes, and URL routing. That coverage now runs in the Vitest jsdom lane that already executes on every change, mounting the real app shell (real Provider, stores, views, keyboard dispatch) with the API mocked at the fetch boundary. A roborev finding on the first commit (the jsdom mock began as a copy of the Playwright mock) is addressed in the second: both suites now consume one fixture/route core, so they cannot drift.

Changes

Converted 9 mock-suite specs (36 cases) into 37 Vitest cases: activity collapse, activity thread runs, budget display, issue routing, palette commands, palette focus trap, PR-detail palette commands, cheatsheet, keyboard shortcuts.
Folded the two tests/e2e-full duplicates (activity collapse, budget display) into the same Vitest files and deleted them.
Added src/test/appHarness.ts (mounts the real App.svelte in jsdom) and made src/test/mockApiFetch.ts the single mock-API source; tests/e2e/support/mockApi.ts is now a thin page.route adapter over the shared handler.
Specs that genuinely need a browser (screenshots, video, geometry, computed CSS, mouse drag) stay in Playwright.

Runtime

Measured back-to-back on the same machine (matched load):

Lane	What runs	Wall time
Before: Playwright (9 specs)	36 cases × chromium + firefox, incl. Vite dev-server boot	22s
After: Vitest (same coverage, 9 files)	37 cases	~4.5s

The converted cases also join the vp test lane that already runs on every change, so their marginal cost there is near zero (full suite stays ~12s wall for 2,318 tests), and CI sheds the browser provisioning and dev-server boot for this coverage entirely.

Validation: full Vitest suite 2,318 passed; full mock Playwright suite after the final edit 320 passed / 8 pre-existing skips; typecheck identical to clean-HEAD baseline.

🤖 Generated with Claude Code

^{generated by a clanker}

roborev-ci · 2026-06-13T01:53:37Z

roborev: Combined Review (`5f5bb67`)

Medium findings remain: the PR drops required full-stack e2e coverage for activity collapse and budget display.

Medium

frontend/src/App.activity-collapse.test.ts:180
The full-stack Playwright coverage for threaded activity collapse was replaced with a jsdom test that mocks /settings and /activity, so collapse/expand is no longer exercised through the real HTTP API and SQLite seed data.
Fix: Restore a minimal tests/e2e-full/activity-collapse.spec.ts that switches to threaded mode and verifies collapse/expand against the real e2e server.
frontend/src/lib/components/layout/StatusBar.budget.test.ts:81
The budget display full-stack tests were removed and replaced with mountApp plus mocked rate-limit responses, leaving no asserted e2e coverage that the real /api/v1/rate-limits data reaches the status bar/popover.
Fix: Keep a small Playwright e2e-full budget test that visits /pulls, waits for seeded rate-limit data, opens the popover, and asserts the REST/GQL/budget values.

Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 6m34s), codex_security (codex/security, done, 1m7s) | Total: 7m50s

roborev-ci · 2026-06-14T15:10:02Z

roborev: Combined Review (`f820b20`)

Mixed verdict: no security issues found, but there are medium-risk test reliability and coverage regressions to address.

Medium

frontend/src/test/appHarness.ts:116
mountApp unmounts the Svelte tree, but app startup begins sync polling and that poller is not stopped during teardown. Stale intervals can keep firing into later tests’ mocked fetch handles, or real fetch after vi.unstubAllGlobals(), which can pollute request assertions and make Vitest flaky.
Fix: Stop stores.sync.stopPolling() as part of full app teardown, or expose and stop the created store before removing globals.
frontend/src/App.activity-collapse.test.ts:1
Activity-collapse coverage was moved from real HTTP/SQLite e2e to hand-written mocked activity/settings payloads, even though the seeded full-stack server already produced this flow. This drops coverage for frontend/backend wire-shape regressions in a user-visible activity workflow.
Fix: Keep a focused e2e-full smoke test for threaded collapse/expand against the real server, and use Vitest mock tests only for extra UI-only states.
frontend/src/App.activity-collapse.test.ts:155
The compact side-pane test no longer verifies that the collapse label is hidden; it only checks that the element exists under .activity-feed--compact. A CSS regression that makes the label visible and breaks the compact layout would now pass.
Fix: Keep this assertion in a browser test using toBeHidden or computed style, since this is rendering/layout behavior.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 5m43s), codex_security (codex/security, done, 16s) | Total: 6m9s

The mock-API Playwright suite paid full-browser cost for behavior that needs no real rendering: keyboard dispatch, focus, store-driven DOM, fetch request shapes, and URL routing all work in jsdom. Those specs now run as Vitest component tests that mount the real App shell (real Provider, stores, and views) with the API mocked at the fetch boundary, via a new shared harness (src/test/appHarness.ts) and a fetch port of the Playwright mockApi fixtures (src/test/mockApiFetch.ts). Converted specs (deleted from tests/e2e): - activity-collapse.spec.ts -> src/App.activity-collapse.test.ts (also folds tests/e2e-full/activity-collapse.spec.ts) - activity-thread-runs.spec.ts -> src/App.activity-thread-runs.test.ts - budget-display.spec.ts -> src/lib/components/layout/StatusBar.budget.test.ts (also folds tests/e2e-full/budget-display.spec.ts) - issue-routing.spec.ts -> src/App.issue-routing.test.ts - palette-commands.spec.ts -> src/App.palette-commands.test.ts - palette-focus-trap.spec.ts -> src/App.palette-focus-trap.test.ts - palette-pr-detail-commands.spec.ts -> src/App.palette-pr-detail-commands.test.ts - cheatsheet.spec.ts -> src/App.cheatsheet.test.ts - keyboard-shortcuts-migration.spec.ts -> src/App.keyboard-shortcuts.test.ts Two intent adaptations: Enter/Space popover activation collapses into a native-button assertion because jsdom does not synthesize click from key activation, and color checks assert the inline var(--budget-red) token instead of computed rgb pixels. Validation: vp test run (2318 passed), playwright --list on the mock config collects 332 tests in 19 files; typecheck failures are identical to clean HEAD (pre-existing, none in new files). Generated with Claude Code (claude-fable-5) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The Vitest harness's mockApiFetch.ts started as a port of the Playwright suite's mockApi.ts, leaving the same fixtures and /api/v1 route matching defined twice. The two copies could drift silently (route shapes, provider identity fields, capability flags), making the suites disagree about the API contract they mock. Flagged by roborev review job 1703 on 2c43481. src/test/mockApiFetch.ts is now the single source: fixtures plus a transport-neutral createMockApiHandler, with the fetch adapter built on top. tests/e2e/support/mockApi.ts becomes a thin page.route adapter over the same handler (it keeps only the SSE /events fulfillment, which the jsdom suite replaces with an EventSource stub). The shared module lives under src/test/ because the lint-api-urls hook only permits raw /api/v1 strings in test directories, and it must not import @playwright/test types. Routes the fetch port had skipped (/repos/summary, legacy issue sync POST, legacy PR PATCH) were restored so the merged core is route-for-route identical to the old Playwright mock; spec-level page.route overrides still shadow the defaults since later registrations win. Validation: vp test run (2318 passed); playwright --list collects 332 tests in 19 files; repo-summary-results-label and theme-toggle specs run green in both browsers through the new adapter (the former exercises the newly shared /repos/summary route); typecheck errors identical to clean HEAD (pre-existing, none in touched files). Generated with Claude Code (claude-fable-5) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The testing policy read "E2E tests are non-negotiable" and described full-stack Playwright coverage as the bar for every change. That framing contradicts the direction this branch took — moving non-browser frontend coverage into the fast Vitest+jsdom lane — and would push the next contributor to keep writing browser specs for behavior (keyboard, focus, store-driven rendering, fetch shapes, routing) that jsdom covers in a fraction of the wall time. Reframe the mandate as: coverage of real behavior is non-negotiable, but the lane is chosen by whether a real browser is actually required. Playwright is reserved for rendering/layout/geometry/screenshot needs; everything else mounts the real app shell in jsdom and mocks the API at the fetch boundary. Also broadens the pre-push rule so a Vitest-only frontend change isn't gated on a Playwright suite that no longer covers it, while still requiring the Playwright run when specs or shared mock fixtures change. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

… tests Converting the mock e2e suite to jsdom added src/test/mockApiFetch.ts as a sanctioned, low-friction way to mock the API — and folding the two e2e-full twins into it traded real-backend wire coverage (a Go-computed budget_spent rendered through the real /api/v1/rate-limits handler) for a hand-written fixture. Left undocumented, the easy mock path invites future authors to assert backend-computed values against a fixture that can drift from what Go actually emits, so both lanes stay green against a stale contract. Add a second decision axis, independent of browser-vs-jsdom: prefer the real Go backend and only mock pure-frontend behavior that is indifferent to the data source. Behavior depending on backend logic (sync, persistence, capabilities, normalization, wire shape) must run against real Go; features with both get split. Documents the rule rather than restoring the deleted e2e-full smokes or adding a contract guard (deferred). 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

The two prior edits ballooned the End-to-End Tests section into three dense paragraphs. CLAUDE.md is loaded into context every session, so verbosity has a standing cost; compress to the two decision axes (browser-only-when-required, real-Go-by-default) without losing the rules. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

roborev (job 1764) flagged that the backend-coverage bullet referenced tests/e2e-full/, which does not resolve from the repo root — the full-stack Playwright tests live under frontend/tests/e2e-full/. Prefix the frontend-relative paths (e2e-full, appHarness, mockApiFetch) with frontend/ so the guidance points at real locations alongside the repo-root internal/server/ reference in the same bullet. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

The converted Vitest files opened with "Converted from tests/e2e/..." / "Folded from tests/e2e-full/..." headers naming the Playwright specs they replaced. Those specs were deleted in the same change, so the comments were stale on arrival and only send a future reader chasing a path that no longer exists. Rewrite each header to describe what the file tests in the present tense; the cross-references to the still-present tests/e2e/support/mockApi.ts adapter are kept because that relationship is live. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

…e and #475 head pinning Rebasing onto main pulled in two upstream behaviors the jsdom harness did not yet handle, leaving the whole app-mount suite stuck on Loading or asserting against a stale contract: - App startup now polls /healthz via waitUntilBackendReady before mounting any view. The Playwright lane gets that from the Vite dev healthcheck plugin, but the jsdom lane stubs fetch wholesale, so the adapter must answer /healthz (and /livez). It also has to resolve relative URLs the way a browser does: undici's new Request('/healthz') throws on a bare path, which the readiness loop swallowed and retried forever. - #475 added head pinning: approve now sends expected_head_sha, gated on a new mutation_head_binding capability and platform_head_sha/ reviewed_head_sha on the detail payload. Carry those fixture fields into the shared mock and restore the head-pin assertion (expected_head_sha == the rendered head) that #475 added to the deleted Playwright spec, so the converted Vitest test keeps that contract coverage. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

roborev-ci · 2026-06-14T23:14:18Z

roborev: Combined Review (`c007521`)

Summary verdict: One medium issue should be fixed before merging; no critical or high findings were reported.

Medium

frontend/src/test/appHarness.ts:116
mountApp now mounts the real app shell, which starts the sync store polling interval after startup, but unmounting only removes the Svelte component/target. Since app cleanup disconnects events but does not stop stores.sync, leaked intervals can keep firing across tests and use whichever fetch stub is currently installed, polluting later assertions or keeping workers alive.

Fix: Stop sync polling during app teardown, for example by calling fullShellStores?.sync.stopPolling() in stopFullAppShell() before dropping the store reference.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 5m58s), codex_security (codex/security, done, 1m34s) | Total: 7m42s

runAppStartup starts stores.sync polling once the shell is ready, but its cancel function only aborts in-flight startup and stopFullAppShell only disconnected events — so the 30s poll interval kept firing after teardown. In production it leaks across full-shell -> embed-route navigation and on destroy; in the jsdom app-mount tests the orphaned interval fires against whichever fetch stub the next test installed, polluting assertions and keeping workers alive. stopFullAppShell now calls sync.stopPolling() (idempotent) before dropping the store reference. App.teardown.test.ts pins it: it counts live intervals across mount/unmount and fails if any survive (verified to fail without the stopPolling call). AppProviderMock gains a sync.stopPolling noop so the existing App.test.ts mock matches the store shape teardown now touches. 🤖 Generated with Claude Code (claude-opus-4-8) Co-Authored-By: Claude Opus <noreply@anthropic.com>

roborev-ci · 2026-06-15T00:53:44Z

roborev: Combined Review (`ceb8748`)

Summary verdict: One medium regression risk remains in frontend test coverage.

Medium

frontend/src/App.activity-collapse.test.ts:155
The compact collapse label visibility check was moved from Playwright to jsdom, but jsdom does not apply the component CSS. A regression such as removing or breaking .activity-feed--compact .collapse-all-label { display: none } would still pass while the narrow side-pane layout regresses.
Fix: Keep a minimal Playwright/browser assertion for compact label visibility, or assert computed style in a browser-backed test.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 8m29s), codex_security (codex/security, done, 52s) | Total: 9m28s

mariusvniekerk and others added 8 commits June 14, 2026 18:44

mariusvniekerk force-pushed the jolly-cartwright-5f6c01 branch from f820b20 to c007521 Compare June 14, 2026 23:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(frontend): move non-browser e2e coverage from Playwright to Vitest#500

test(frontend): move non-browser e2e coverage from Playwright to Vitest#500
mariusvniekerk wants to merge 9 commits into
mainfrom
jolly-cartwright-5f6c01

mariusvniekerk commented Jun 13, 2026

Uh oh!

roborev-ci Bot commented Jun 13, 2026

Uh oh!

roborev-ci Bot commented Jun 14, 2026

Uh oh!

roborev-ci Bot commented Jun 14, 2026

Uh oh!

roborev-ci Bot commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant

Conversation

mariusvniekerk commented Jun 13, 2026

Changes

Runtime

Uh oh!

roborev-ci Bot commented Jun 13, 2026

roborev: Combined Review (5f5bb67)

Medium

Uh oh!

roborev-ci Bot commented Jun 14, 2026

roborev: Combined Review (f820b20)

Medium

Uh oh!

roborev-ci Bot commented Jun 14, 2026

roborev: Combined Review (c007521)

Medium

Uh oh!

roborev-ci Bot commented Jun 15, 2026

roborev: Combined Review (ceb8748)

Medium

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

1 participant

roborev: Combined Review (`5f5bb67`)

roborev: Combined Review (`f820b20`)

roborev: Combined Review (`c007521`)

roborev: Combined Review (`ceb8748`)