Rebuild /compare to diff two extraction.json runs side-by-side #17

Open
Khaostica wants to merge 1 commit into aglover1221:main from Khaostica:feat/compare-extractions

@Khaostica

Summary

Rebuilds /compare from a placeholder into a real side-by-side diff viewer for two extraction.json trees. Walks every field, preserves per-value evidence blocks (source, page, quote, confidence), and surfaces confidence deltas.
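
To make the evidence shape and the confidence delta concrete, here is a minimal sketch. The interface names, the field types, and the choice of "right minus left" as the delta are assumptions for illustration; the actual schema and semantics live in extraction-diff.ts and may differ.

```typescript
// Hypothetical shapes; the real extraction.json schema may differ.
interface Evidence {
  source: string;
  page: number | null;
  quote: string;
  confidence: number; // assumed to be a number such as 0.0 to 1.0
}

interface EvidencedValue<T> {
  value: T | null;
  evidence: Evidence | null;
}

/**
 * Assumed semantics: delta = right.confidence - left.confidence.
 * Returns null when either side is missing or carries no evidence,
 * since there is nothing meaningful to compare in that case.
 */
function confidenceDelta<T>(
  left: EvidencedValue<T> | undefined,
  right: EvidencedValue<T> | undefined,
): number | null {
  const l = left?.evidence?.confidence;
  const r = right?.evidence?.confidence;
  if (typeof l !== "number" || typeof r !== "number") return null;
  return r - l;
}
```

A positive result under this convention means the right-hand extraction is more confident about the field than the left-hand one.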

What's in here

  • lib/pipeline/extraction-diff.ts: pure diffExtractions(left, right) function. Handles { value, evidence } scalars, scalar sets (multiset semantics, so a pure reorder reports as unchanged), and keyed row lists (keyed by slug | id | sku | name | code, with positional fallback). Identity keys (vendor, slug, extraction_metadata, sources, …) are excluded from diff rows and surfaced as page-header context.
  • app/api/compare/route.ts — GET /api/compare?left=<slug>&right=<slug>. 400 on missing/equal params, 404 on unresolved slug, 409 on cross-category attempt.
  • app/compare/page.tsx + _components/{ComparePicker,DiffTable}.tsx — server-component shell with URL-synced client pickers, summary pills, collapse-unchanged toggle, expand-evidence cells, keyed-list groups.
  • Navbar entry — /compare between Audit and Usage.
  • Tests — 13 unit cases in tests/unit/extraction-diff.test.ts covering identity capture, identical-tree no-op, scalar change with evidence, null-equality, added/removed asymmetry, set diff with reorder, keyed-row diff by slug, positional fallback, and confidence-delta semantics.
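
The multiset semantics for scalar sets can be sketched as follows. This is an illustrative stand-alone helper, not the code in extraction-diff.ts; the names diffScalarSet and SetDiff are hypothetical.

```typescript
type Scalar = string | number | boolean;

interface SetDiff {
  added: Scalar[];
  removed: Scalar[];
  unchanged: boolean;
}

// Multiset diff: order is ignored, but duplicate counts are respected.
function diffScalarSet(left: Scalar[], right: Scalar[]): SetDiff {
  // Include typeof in the key so 1 and "1" are treated as different items.
  const keyOf = (s: Scalar): string => `${typeof s}:${String(s)}`;

  // Count how many times each item occurs on the left.
  const counts = new Map<string, { item: Scalar; n: number }>();
  for (const item of left) {
    const entry = counts.get(keyOf(item));
    if (entry) entry.n += 1;
    else counts.set(keyOf(item), { item, n: 1 });
  }

  // Each right item consumes one matching left occurrence, or is an addition.
  const added: Scalar[] = [];
  for (const item of right) {
    const entry = counts.get(keyOf(item));
    if (entry && entry.n > 0) entry.n -= 1;
    else added.push(item);
  }

  // Whatever left occurrences were never consumed have been removed.
  const removed: Scalar[] = [];
  counts.forEach(({ item, n }) => {
    for (let i = 0; i < n; i++) removed.push(item);
  });

  return { added, removed, unchanged: added.length === 0 && removed.length === 0 };
}
```

With this approach, `["a", "b"]` against `["b", "a"]` is unchanged, while `["a", "a"]` against `["a"]` reports one removal.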

Run-id comparison (two runs of the same product) is intentionally deferred; see the linked issue for the follow-up.

Related Issues

Closes #16.

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • Other (describe below)

Testing

  • Tests added/updated: tests/unit/extraction-diff.test.ts, 13 cases, all green.
  • Manually tested: npx tsc --noEmit is clean; npx vitest run passes 48/48; npx next build succeeds and registers /compare and /api/compare. npm run dev shows the empty state on the bundled sample (only one extraction is on disk), so the diff path is covered only by unit tests until a second sample lands.

Checklist

  • Code follows project style guidelines (vendor-neutral, schema-driven, evidence-preserving — per CONTRIBUTING).
  • Self-review completed.
  • Changes are documented (inline doc comments on the public surface of extraction-diff.ts and the API route).

Replaces the placeholder /compare page with a real side-by-side viewer that
walks both extraction trees, preserves per-value evidence, and reports
confidence deltas.

- lib/pipeline/extraction-diff.ts: pure diff function over {value, evidence}
  fields, scalar sets (workload_tags etc), and keyed row lists (cpu_skus,
  pcie_slots) with slug/id/sku/name preference and positional fallback.
- app/api/compare/route.ts: GET /api/compare?left=<slug>&right=<slug>.
- app/compare/page.tsx + _components/{ComparePicker,DiffTable}.tsx: server
  component shell with client-side pickers, collapse-unchanged toggle, and
  expand-evidence cells.
- tests/unit/extraction-diff.test.ts: 13 unit tests covering identity, scalar
  changes, add/remove, set diffs, keyed-row diffs, positional fallback, and
  confidence-delta semantics.
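
As an illustration of keyed-row pairing with positional fallback, here is a minimal sketch. The helper names are hypothetical, the key preference order is taken from the PR summary, and the sketch assumes key fields are plain strings or numbers and are unique within each list; the real function may handle these cases differently.

```typescript
type Row = Record<string, unknown>;

// Preference order as listed in the PR summary.
const KEY_PREFERENCE = ["slug", "id", "sku", "name", "code"];

// Pick the first preferred field that every row on both sides carries.
function pickKeyField(rows: Row[]): string | null {
  if (rows.length === 0) return null;
  for (const field of KEY_PREFERENCE) {
    const usable = rows.every(
      (r) => typeof r[field] === "string" || typeof r[field] === "number",
    );
    if (usable) return field;
  }
  return null;
}

interface RowPair {
  key: string;
  left?: Row;
  right?: Row;
  status: "added" | "removed" | "paired";
}

function pairRows(left: Row[], right: Row[]): RowPair[] {
  const field = pickKeyField([...left, ...right]);
  // Positional fallback: with no usable key field, pair rows by index.
  const keyOf = (row: Row, index: number): string =>
    field ? String(row[field]) : `#${index}`;

  const pairs = new Map<string, RowPair>();
  left.forEach((row, i) => {
    const key = keyOf(row, i);
    pairs.set(key, { key, left: row, status: "removed" });
  });
  right.forEach((row, i) => {
    const key = keyOf(row, i);
    const existing = pairs.get(key);
    if (existing) {
      existing.right = row;
      existing.status = "paired";
    } else {
      pairs.set(key, { key, right: row, status: "added" });
    }
  });
  return Array.from(pairs.values());
}
```

Rows marked "paired" would then be diffed field by field; "added" and "removed" rows exist on only one side.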

Cross-category comparison is rejected by both the API (409) and the page;
run-id comparison (two runs of the same product) is intentionally deferred.
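
The status-code rules for GET /api/compare can be sketched as a pure validation step, independent of Next.js. The function name, the ExtractionRef shape, the `category` field, and the error strings are assumptions for illustration, not the route's actual code.

```typescript
// Hypothetical minimal view of a resolved extraction.
interface ExtractionRef {
  slug: string;
  category: string;
}

type CompareCheck =
  | { ok: true; left: ExtractionRef; right: ExtractionRef }
  | { ok: false; status: 400 | 404 | 409; error: string };

function checkCompareRequest(
  leftSlug: string | null,
  rightSlug: string | null,
  lookup: (slug: string) => ExtractionRef | undefined,
): CompareCheck {
  // 400: a parameter is missing, or both name the same extraction.
  if (!leftSlug || !rightSlug) {
    return { ok: false, status: 400, error: "left and right are required" };
  }
  if (leftSlug === rightSlug) {
    return { ok: false, status: 400, error: "left and right must differ" };
  }

  // 404: one of the slugs does not resolve to an extraction.
  const left = lookup(leftSlug);
  const right = lookup(rightSlug);
  if (!left || !right) {
    return { ok: false, status: 404, error: "unknown slug" };
  }

  // 409: the two extractions belong to different categories.
  if (left.category !== right.category) {
    return { ok: false, status: 409, error: "cross-category comparison" };
  }
  return { ok: true, left, right };
}
```

Keeping this check free of framework types would let both the API route and the page share it, which matches the note that both reject cross-category comparison.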