Skip to content

test(frontend): move non-browser e2e coverage from Playwright to Vitest#500

Open
mariusvniekerk wants to merge 9 commits into
mainfrom
jolly-cartwright-5f6c01
Open

test(frontend): move non-browser e2e coverage from Playwright to Vitest#500
mariusvniekerk wants to merge 9 commits into
mainfrom
jolly-cartwright-5f6c01

Conversation

@mariusvniekerk

Copy link
Copy Markdown
Collaborator

The mock-API Playwright lane paid full-browser cost — two browser projects plus a Vite dev server per run — for behavior that needs no real rendering: keyboard dispatch, focus, store-driven DOM, fetch request shapes, and URL routing. That coverage now runs in the Vitest jsdom lane that already executes on every change, mounting the real app shell (real Provider, stores, views, keyboard dispatch) with the API mocked at the fetch boundary. A roborev finding on the first commit (the jsdom mock began as a copy of the Playwright mock) is addressed in the second: both suites now consume one fixture/route core, so they cannot drift.

Changes

  • Converted 9 mock-suite specs (36 cases) into 37 Vitest cases: activity collapse, activity thread runs, budget display, issue routing, palette commands, palette focus trap, PR-detail palette commands, cheatsheet, keyboard shortcuts.
  • Folded the two tests/e2e-full duplicates (activity collapse, budget display) into the same Vitest files and deleted them.
  • Added src/test/appHarness.ts (mounts the real App.svelte in jsdom) and made src/test/mockApiFetch.ts the single mock-API source; tests/e2e/support/mockApi.ts is now a thin page.route adapter over the shared handler.
  • Specs that genuinely need a browser (screenshots, video, geometry, computed CSS, mouse drag) stay in Playwright.

Runtime

Measured back-to-back on the same machine (matched load):

Lane What runs Wall time
Before: Playwright (9 specs) 36 cases × chromium + firefox, incl. Vite dev-server boot 22s
After: Vitest (same coverage, 9 files) 37 cases ~4.5s

The converted cases also join the vp test lane that already runs on every change, so their marginal cost there is near zero (full suite stays ~12s wall for 2,318 tests), and CI sheds the browser provisioning and dev-server boot for this coverage entirely.

Validation: full Vitest suite 2,318 passed; full mock Playwright suite after the final edit 320 passed / 8 pre-existing skips; typecheck identical to clean-HEAD baseline.

🤖 Generated with Claude Code

generated by a clanker

@roborev-ci

roborev-ci Bot commented Jun 13, 2026

Copy link
Copy Markdown

roborev: Combined Review (5f5bb67)

Medium findings remain: the PR drops required full-stack e2e coverage for activity collapse and budget display.

Medium

  • frontend/src/App.activity-collapse.test.ts:180
    The full-stack Playwright coverage for threaded activity collapse was replaced with a jsdom test that mocks /settings and /activity, so collapse/expand is no longer exercised through the real HTTP API and SQLite seed data.
    Fix: Restore a minimal tests/e2e-full/activity-collapse.spec.ts that switches to threaded mode and verifies collapse/expand against the real e2e server.

  • frontend/src/lib/components/layout/StatusBar.budget.test.ts:81
    The budget display full-stack tests were removed and replaced with mountApp plus mocked rate-limit responses, leaving no asserted e2e coverage that the real /api/v1/rate-limits data reaches the status bar/popover.
    Fix: Keep a small Playwright e2e-full budget test that visits /pulls, waits for seeded rate-limit data, opens the popover, and asserts the REST/GQL/budget values.


Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 6m34s), codex_security (codex/security, done, 1m7s) | Total: 7m50s

@roborev-ci

roborev-ci Bot commented Jun 14, 2026

Copy link
Copy Markdown

roborev: Combined Review (f820b20)

Mixed verdict: no security issues found, but there are medium-risk test reliability and coverage regressions to address.

Medium

  • frontend/src/test/appHarness.ts:116
    mountApp unmounts the Svelte tree, but app startup begins sync polling and that poller is not stopped during teardown. Stale intervals can keep firing into later tests’ mocked fetch handles, or real fetch after vi.unstubAllGlobals(), which can pollute request assertions and make Vitest flaky.
    Fix: Stop stores.sync.stopPolling() as part of full app teardown, or expose and stop the created store before removing globals.

  • frontend/src/App.activity-collapse.test.ts:1
    Activity-collapse coverage was moved from real HTTP/SQLite e2e to hand-written mocked activity/settings payloads, even though the seeded full-stack server already produced this flow. This drops coverage for frontend/backend wire-shape regressions in a user-visible activity workflow.
    Fix: Keep a focused e2e-full smoke test for threaded collapse/expand against the real server, and use Vitest mock tests only for extra UI-only states.

  • frontend/src/App.activity-collapse.test.ts:155
    The compact side-pane test no longer verifies that the collapse label is hidden; it only checks that the element exists under .activity-feed--compact. A CSS regression that makes the label visible and breaks the compact layout would now pass.
    Fix: Keep this assertion in a browser test using toBeHidden or computed style, since this is rendering/layout behavior.


Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 5m43s), codex_security (codex/security, done, 16s) | Total: 6m9s

mariusvniekerk and others added 8 commits June 14, 2026 18:44
The mock-API Playwright suite paid full-browser cost for behavior that
needs no real rendering: keyboard dispatch, focus, store-driven DOM,
fetch request shapes, and URL routing all work in jsdom. Those specs now
run as Vitest component tests that mount the real App shell (real
Provider, stores, and views) with the API mocked at the fetch boundary,
via a new shared harness (src/test/appHarness.ts) and a fetch port of
the Playwright mockApi fixtures (src/test/mockApiFetch.ts).

Converted specs (deleted from tests/e2e):
- activity-collapse.spec.ts -> src/App.activity-collapse.test.ts
  (also folds tests/e2e-full/activity-collapse.spec.ts)
- activity-thread-runs.spec.ts -> src/App.activity-thread-runs.test.ts
- budget-display.spec.ts -> src/lib/components/layout/StatusBar.budget.test.ts
  (also folds tests/e2e-full/budget-display.spec.ts)
- issue-routing.spec.ts -> src/App.issue-routing.test.ts
- palette-commands.spec.ts -> src/App.palette-commands.test.ts
- palette-focus-trap.spec.ts -> src/App.palette-focus-trap.test.ts
- palette-pr-detail-commands.spec.ts -> src/App.palette-pr-detail-commands.test.ts
- cheatsheet.spec.ts -> src/App.cheatsheet.test.ts
- keyboard-shortcuts-migration.spec.ts -> src/App.keyboard-shortcuts.test.ts

Two intent adaptations: Enter/Space popover activation collapses into a
native-button assertion because jsdom does not synthesize click from
key activation, and color checks assert the inline var(--budget-red)
token instead of computed rgb pixels.

Validation: vp test run (2318 passed), playwright --list on the mock
config collects 332 tests in 19 files; typecheck failures are identical
to clean HEAD (pre-existing, none in new files).

Generated with Claude Code (claude-fable-5)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The Vitest harness's mockApiFetch.ts started as a port of the Playwright
suite's mockApi.ts, leaving the same fixtures and /api/v1 route matching
defined twice. The two copies could drift silently (route shapes, provider
identity fields, capability flags), making the suites disagree about the
API contract they mock. Flagged by roborev review job 1703 on 2c43481.

src/test/mockApiFetch.ts is now the single source: fixtures plus a
transport-neutral createMockApiHandler, with the fetch adapter built on
top. tests/e2e/support/mockApi.ts becomes a thin page.route adapter over
the same handler (it keeps only the SSE /events fulfillment, which the
jsdom suite replaces with an EventSource stub). The shared module lives
under src/test/ because the lint-api-urls hook only permits raw /api/v1
strings in test directories, and it must not import @playwright/test
types. Routes the fetch port had skipped (/repos/summary, legacy issue
sync POST, legacy PR PATCH) were restored so the merged core is
route-for-route identical to the old Playwright mock; spec-level
page.route overrides still shadow the defaults since later registrations
win.

Validation: vp test run (2318 passed); playwright --list collects 332
tests in 19 files; repo-summary-results-label and theme-toggle specs run
green in both browsers through the new adapter (the former exercises the
newly shared /repos/summary route); typecheck errors identical to clean
HEAD (pre-existing, none in touched files).

Generated with Claude Code (claude-fable-5)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The testing policy read "E2E tests are non-negotiable" and described
full-stack Playwright coverage as the bar for every change. That framing
contradicts the direction this branch took — moving non-browser frontend
coverage into the fast Vitest+jsdom lane — and would push the next
contributor to keep writing browser specs for behavior (keyboard, focus,
store-driven rendering, fetch shapes, routing) that jsdom covers in a
fraction of the wall time.

Reframe the mandate as: coverage of real behavior is non-negotiable, but
the lane is chosen by whether a real browser is actually required.
Playwright is reserved for rendering/layout/geometry/screenshot needs;
everything else mounts the real app shell in jsdom and mocks the API at
the fetch boundary. Also broadens the pre-push rule so a Vitest-only
frontend change isn't gated on a Playwright suite that no longer covers
it, while still requiring the Playwright run when specs or shared mock
fixtures change.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
… tests

Converting the mock e2e suite to jsdom added src/test/mockApiFetch.ts as a
sanctioned, low-friction way to mock the API — and folding the two
e2e-full twins into it traded real-backend wire coverage (a Go-computed
budget_spent rendered through the real /api/v1/rate-limits handler) for a
hand-written fixture. Left undocumented, the easy mock path invites future
authors to assert backend-computed values against a fixture that can drift
from what Go actually emits, so both lanes stay green against a stale
contract.

Add a second decision axis, independent of browser-vs-jsdom: prefer the
real Go backend and only mock pure-frontend behavior that is indifferent to
the data source. Behavior depending on backend logic (sync, persistence,
capabilities, normalization, wire shape) must run against real Go; features
with both get split. Documents the rule rather than restoring the deleted
e2e-full smokes or adding a contract guard (deferred).

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
The two prior edits ballooned the End-to-End Tests section into three
dense paragraphs. CLAUDE.md is loaded into context every session, so
verbosity has a standing cost; compress to the two decision axes
(browser-only-when-required, real-Go-by-default) without losing the
rules.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
roborev (job 1764) flagged that the backend-coverage bullet referenced
tests/e2e-full/, which does not resolve from the repo root — the
full-stack Playwright tests live under frontend/tests/e2e-full/. Prefix
the frontend-relative paths (e2e-full, appHarness, mockApiFetch) with
frontend/ so the guidance points at real locations alongside the
repo-root internal/server/ reference in the same bullet.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
The converted Vitest files opened with "Converted from
tests/e2e/..." / "Folded from tests/e2e-full/..." headers naming the
Playwright specs they replaced. Those specs were deleted in the same
change, so the comments were stale on arrival and only send a future
reader chasing a path that no longer exists. Rewrite each header to
describe what the file tests in the present tense; the cross-references
to the still-present tests/e2e/support/mockApi.ts adapter are kept
because that relationship is live.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
…e and #475 head pinning

Rebasing onto main pulled in two upstream behaviors the jsdom harness
did not yet handle, leaving the whole app-mount suite stuck on Loading or
asserting against a stale contract:

- App startup now polls /healthz via waitUntilBackendReady before mounting
  any view. The Playwright lane gets that from the Vite dev healthcheck
  plugin, but the jsdom lane stubs fetch wholesale, so the adapter must
  answer /healthz (and /livez). It also has to resolve relative URLs the
  way a browser does: undici's new Request('/healthz') throws on a bare
  path, which the readiness loop swallowed and retried forever.
- #475 added head pinning: approve now sends expected_head_sha, gated on a
  new mutation_head_binding capability and platform_head_sha/
  reviewed_head_sha on the detail payload. Carry those fixture fields into
  the shared mock and restore the head-pin assertion (expected_head_sha ==
  the rendered head) that #475 added to the deleted Playwright spec, so the
  converted Vitest test keeps that contract coverage.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
@mariusvniekerk mariusvniekerk force-pushed the jolly-cartwright-5f6c01 branch from f820b20 to c007521 Compare June 14, 2026 23:07
@roborev-ci

roborev-ci Bot commented Jun 14, 2026

Copy link
Copy Markdown

roborev: Combined Review (c007521)

Summary verdict: One medium issue should be fixed before merging; no critical or high findings were reported.

Medium

  • frontend/src/test/appHarness.ts:116
    mountApp now mounts the real app shell, which starts the sync store polling interval after startup, but unmounting only removes the Svelte component/target. Since app cleanup disconnects events but does not stop stores.sync, leaked intervals can keep firing across tests and use whichever fetch stub is currently installed, polluting later assertions or keeping workers alive.

    Fix: Stop sync polling during app teardown, for example by calling fullShellStores?.sync.stopPolling() in stopFullAppShell() before dropping the store reference.


Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 5m58s), codex_security (codex/security, done, 1m34s) | Total: 7m42s

runAppStartup starts stores.sync polling once the shell is ready, but its
cancel function only aborts in-flight startup and stopFullAppShell only
disconnected events — so the 30s poll interval kept firing after teardown.
In production it leaks across full-shell -> embed-route navigation and on
destroy; in the jsdom app-mount tests the orphaned interval fires against
whichever fetch stub the next test installed, polluting assertions and
keeping workers alive.

stopFullAppShell now calls sync.stopPolling() (idempotent) before dropping
the store reference. App.teardown.test.ts pins it: it counts live intervals
across mount/unmount and fails if any survive (verified to fail without the
stopPolling call). AppProviderMock gains a sync.stopPolling noop so the
existing App.test.ts mock matches the store shape teardown now touches.

🤖 Generated with Claude Code (claude-opus-4-8)
Co-Authored-By: Claude Opus <noreply@anthropic.com>
@roborev-ci

roborev-ci Bot commented Jun 15, 2026

Copy link
Copy Markdown

roborev: Combined Review (ceb8748)

Summary verdict: One medium regression risk remains in frontend test coverage.

Medium

  • frontend/src/App.activity-collapse.test.ts:155
    The compact collapse label visibility check was moved from Playwright to jsdom, but jsdom does not apply the component CSS. A regression such as removing or breaking .activity-feed--compact .collapse-all-label { display: none } would still pass while the narrow side-pane layout regresses.
    Fix: Keep a minimal Playwright/browser assertion for compact label visibility, or assert computed style in a browser-backed test.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 8m29s), codex_security (codex/security, done, 52s) | Total: 9m28s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant