DO-NOT-MERGE: #657 revert for keyed-regression confirm #734

Closed
azchohfi wants to merge 1 commit into main from revert-657

@azchohfi

Collaborator

Measurement-only: reverts #657 (628fb7f) off current main to confirm its keyed-list reconcile/diff regression as an inverse signal. Do not merge; will be torn down.

@azchohfi

Collaborator Author

/perf

@github-actions


⚡ Reactor perf comparison

Workload: StressPerf.ReactorOptimized StocksGrid · --percent 50 --duration 10 · x64 Release · median of 12 paired runs (2 warmup dropped); Δ is the mean change with a 95% CI · PR head and main built and run interleaved on the same runner.

Regression vs main baseline

| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 2.66 | 2.63 | -1.5% 95% CI [-7.3, +4.2] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 121.7 | 122.5 | +2.6% 95% CI [-1.9, +7.1] | ≈ within noise |
| Avg Diff (ms) ↓ | 112.1 | 111.8 | +2.4% 95% CI [-2.3, +7.2] | ≈ within noise |
| Avg Memory (MB) ↓ | 283.8 | 283.8 | -0.2% 95% CI [-1.2, +0.8] | ≈ within noise |
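
The Δ and Status columns can be read as a paired comparison: each run yields a percent change of the PR against main, the mean of those changes is reported with a 95% CI, and the status depends on whether that interval excludes 0. The following is a minimal sketch of that reading, not the harness's actual code. The function names, the Student-t interval, and the small critical-value table are assumptions made for illustration; the workflow may compute its interval differently.

```python
import math
import statistics

# Two-sided 95% Student-t critical values by degrees of freedom.
# Only a few entries are listed; this table is an illustrative assumption.
T_CRIT_95 = {4: 2.776, 9: 2.262, 11: 2.201}


def paired_delta_ci(baseline, candidate):
    """Return (mean, low, high) of the per-pair percent change, with a 95% CI."""
    if len(baseline) != len(candidate) or len(baseline) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    df = len(baseline) - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no critical value listed for df={df}")
    # Percent change of each PR run relative to its paired main run.
    changes = [100.0 * (c - b) / b for b, c in zip(baseline, candidate)]
    mean = statistics.fmean(changes)
    sem = statistics.stdev(changes) / math.sqrt(len(changes))
    half_width = T_CRIT_95[df] * sem
    return mean, mean - half_width, mean + half_width


def classify(ci_low, ci_high, lower_is_better):
    """Map a CI to a status: 'within noise' whenever the interval includes 0."""
    if ci_low <= 0.0 <= ci_high:
        return "within noise"
    improved = ci_high < 0.0 if lower_is_better else ci_low > 0.0
    return "improvement" if improved else "regression"
```

With the intervals reported in this comment, `classify(-7.3, 4.2, lower_is_better=False)` gives `"within noise"` for the headline Renders/sec row, while `classify(-17.9, -13.4, lower_is_better=True)` gives `"improvement"` for the keyed-list Avg Reconcile row.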

Low-mutation skip-floor (--percent 0)

At --percent 0 the workload mutates few cells per tick (always at least one), so reconcile/diff isolate the O(n) per-tick child skip-walk floor that higher mutation rates dilute — ChildReconciler re-walks every child each tick even when nothing moved. The closer --percent is to 0, the more this floor is the signal, so a structural-skip optimization shows up cleanly where the headline table above buries it. Δ is the mean paired change with a 95% CI.

| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 16.59 | 16.48 | -1.2% 95% CI [-8.5, +6.0] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 37.1 | 35.0 | -1.8% 95% CI [-7.7, +4.1] | ≈ within noise |
| Avg Diff (ms) ↓ | 35.0 | 33.0 | -2.1% 95% CI [-8.2, +4.0] | ≈ within noise |
| Avg Memory (MB) ↓ | 266.0 | 265.4 | -0.2% 95% CI [-0.5, +0.1] | ≈ within noise |
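
To make the skip-walk floor concrete, here is a toy positional reconciler that visits every child on every tick and patches only the ones that changed. It is a sketch for illustration only and does not reproduce Reactor's ChildReconciler; the function name and the plain-list child model are hypothetical.

```python
def reconcile_positional(old_children, new_children):
    """Compare children slot by slot; return (slots visited, indices patched)."""
    visited = 0
    patched = []
    for index, (old, new) in enumerate(zip(old_children, new_children)):
        visited += 1  # every slot is walked, changed or not
        if old != new:
            patched.append(index)  # only changed slots cost a patch
    return visited, patched


old = list(range(1000))
new = old.copy()
new[7] = -1  # a single mutated cell, as in a low --percent tick

visited, patched = reconcile_positional(old, new)
print(visited, patched)
```

The walk cost stays proportional to the child count even though a single slot is patched, which is the floor the --percent 0 table isolates.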

Allocation (Reactor) — lower is better

| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Alloc bytes/render ↓ | 4848013 | 4884956 | +1.4% 95% CI [+0.2, +2.6] | ⚠️ regression |
| Gen0 GC / 1k renders ↓ | 192.31 | 200.00 | +8.1% 95% CI [-3.9, +20.1] | ≈ within noise |

Keyed-list workload (StressPerf.KeyedList, --percent 50)

A separate macro workload: a ~500-row stably keyed list whose rows are reordered, inserted, or removed each tick. Because every child carries a key, the child reconciler takes its keyed arm (ReconcileKeyed → ReconcileKeyedMiddle, the LIS-based minimal-move pass) instead of the positional re-walk that the StocksGrid tables above measure. This makes it the sensitive macro signal for keyed-diff work, which the positional cells can never reach. It uses the same interleaved paired-Δ 95% CI as the headline table.

| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 16.34 | 18.73 | +16.1% 95% CI [+12.3, +20.0] | ✅ improvement |
| Avg Reconcile (ms) ↓ | 20.6 | 17.6 | -15.6% 95% CI [-17.9, -13.4] | ✅ improvement |
| Avg Diff (ms) ↓ | 20.4 | 17.4 | -15.6% 95% CI [-17.8, -13.3] | ✅ improvement |
| Avg Memory (MB) ↓ | 164.2 | 167.9 | +1.7% 95% CI [+1.0, +2.4] | ⚠️ regression |
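
For readers unfamiliar with an LIS-based minimal-move pass, the general technique can be sketched as follows: map each surviving key to its old position, find a longest increasing subsequence of those positions, and move only the keys outside it. This is the textbook approach under assumed names (`longest_increasing_subsequence`, `keys_to_move`), not a rendering of Reactor's keyed reconcile code.

```python
from bisect import bisect_left


def longest_increasing_subsequence(seq):
    """Return indices into seq forming one longest strictly increasing run."""
    tail_idx = []   # tail_idx[k]: index of the smallest tail of a run of length k+1
    tail_val = []   # values parallel to tail_idx, kept sorted for bisect
    prev = [-1] * len(seq)
    for i, value in enumerate(seq):
        k = bisect_left(tail_val, value)
        if k == len(tail_idx):
            tail_idx.append(i)
            tail_val.append(value)
        else:
            tail_idx[k] = i
            tail_val[k] = value
        prev[i] = tail_idx[k - 1] if k > 0 else -1
    # Walk the predecessor links back from the tail of the longest run.
    result = []
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        result.append(i)
        i = prev[i]
    return result[::-1]


def keys_to_move(old_keys, new_keys):
    """Keys present in both lists that must be moved to reach the new order."""
    old_index = {key: i for i, key in enumerate(old_keys)}
    survivors = [key for key in new_keys if key in old_index]
    positions = [old_index[key] for key in survivors]
    stay = {survivors[i] for i in longest_increasing_subsequence(positions)}
    return [key for key in survivors if key not in stay]


print(keys_to_move(["a", "b", "c", "d", "e"], ["e", "a", "b", "c", "d"]))
```

In this example only `e` has to move, because `a b c d` already appear in increasing old-position order. Inserted and removed keys are handled outside this pass, so they never count as moves.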

Allocation (keyed-list) — lower is better

| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Alloc bytes/render ↓ | 216279 | 314216 | +45.2% 95% CI [+44.4, +46.0] | ⚠️ regression |
| Gen0 GC / 1k renders ↓ | 11.83 | 15.67 | +36.1% 95% CI [+27.7, +44.5] | ⚠️ regression |

Reconciler micro-benchmarks (PerfBench.ControlModel)

Production --variant Reactor control-model path, ns-resolution and WinUI-undiluted (spec-047 M1–M13) — ↓ lower is better. Status tracks allocated bytes/op, the authoritative signal here; it is deterministic for structurally-fixed benches, while dispatcher / background-thread benches carry a small process-to-process offset, so a bench is flagged only when its 95% CI clears a ±3% minimum-effect band (real structural alloc changes are several percent to many-x). ns/op is shown for context but is not auto-flagged (its paired CI is rep-interleaved but the flag remains dormant pending a real-CI identical-binary band calibration). Δ is the mean paired change with a 95% CI.

| Bench | main ns/op | Δ ns (95% CI) | main B/op | Δ alloc (95% CI) | Status |
| --- | --- | --- | --- | --- | --- |
| M1 Mount_Leaf_NoCallback | 149999.2 | +0.6% 95% CI [-2.3, +3.6] | 1140.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M2 Mount_Leaf_OneCallback | 109267.4 | -1.0% 95% CI [-5.7, +3.7] | 3383.3 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M3 Mount_Leaf_ThreeCallbacks | 225433.3 | -1.8% 95% CI [-5.8, +2.1] | 8395.4 | +1.6% 95% CI [+0.2, +2.9] | ≈ within noise |
| M4 Dispatch_Switch_Cold | 112542.6 | -3.6% 95% CI [-8.3, +1.0] | 1767.8 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M5 Dispatch_Switch_Warm | 111827.4 | +1.1% 95% CI [-7.6, +9.7] | 1805.9 | -1.4% 95% CI [-3.6, +0.8] | ≈ within noise |
| M6 Dispatch_ExternalType | 91199.2 | +0.6% 95% CI [-0.5, +1.6] | 1028.6 | -2.4% 95% CI [-6.4, +1.5] | ≈ within noise |
| M7 Update_NoChange | 55403.2 | +0.2% 95% CI [-0.5, +0.8] | 370.1 | +8.4% 95% CI [-3.1, +19.8] | ≈ within noise |
| M8 Update_OneLeafChanged | 42066.4 | +0.8% 95% CI [-2.0, +3.7] | 536.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M9 Update_AllChanged | 2884322.0 | +0.1% 95% CI [-1.2, +1.4] | 184278.1 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M10 EventHandlerState_Alloc | 86233.9 | -0.1% 95% CI [-2.6, +2.4] | 3095.2 | 0.0% 95% CI [0.0, +0.1] | ≈ within noise |
| M11 ModifierEHS_Frequency | 45952.8 | +1.3% 95% CI [-0.5, +3.2] | 638.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M12 Pool_Rent_HotPath | 117647.9 | +1.6% 95% CI [+0.1, +3.1] | 1099.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M13 Setters_Suppression_Scope | 107.1 | +28.9% 95% CI [+5.0, +52.8] | 26.7 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M14 Dsl_Rebuild_Cascade | 1580037.0 | +0.7% 95% CI [-1.7, +3.1] | 2231828.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| C207 ChangeHandler_DpRead_Coalesce | 1262.9 | +6.4% 95% CI [-9.9, +22.8] | 0.6 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| OAlloc Optional_Element_Alloc | 214.5 | +4.4% 95% CI [-2.6, +11.5] | 528.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| OUpdate Optional_Reconciler_Update | 12558.9 | -0.8% 95% CI [-3.1, +1.5] | 2772.3 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
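
The micro-benchmark Status column uses a stricter rule than the macro tables: an allocation Δ is flagged only when its 95% CI clears the ±3% minimum-effect band. One plausible reading of "clears", sketched below, is that the whole interval must lie beyond the band. The function name `flag_alloc` and that interpretation are assumptions; the workflow's actual comparison may differ.

```python
def flag_alloc(ci_low, ci_high, band=3.0):
    """Flag an allocation delta only if its entire 95% CI lies outside +/-band percent."""
    if ci_low > band:
        return "regression"    # allocations rose by more than the band
    if ci_high < -band:
        return "improvement"   # allocations fell by more than the band
    return "within noise"


# M3 Mount_Leaf_ThreeCallbacks: the CI excludes 0 but stays inside the band.
print(flag_alloc(0.2, 2.9))
# M7 Update_NoChange: a wide CI that straddles 0.
print(flag_alloc(-3.1, 19.8))
```

Under this reading both rows come out as within noise, matching the table: M3's interval excludes 0 but never leaves the band, so it is not treated as a structural allocation change.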

Cross-framework reference (same StocksGrid workload)

| Metric | vanilla WinUI3¹ | Rust windows-reactor² | Reactor (this PR) |
| --- | --- | --- | --- |
| Renders/sec ↑ | 3.06 | 4.65 | 2.63 |
| Avg Reconcile (ms) ↓ | n/a | 19.7 | 122.5 |
| Avg Diff (ms) ↓ | n/a | 18.3 | 111.8 |
| Avg Memory (MB) ↓ | 263.3 | 197.8 | 283.8 |

↑ higher is better · ↓ lower is better. Within noise = the 95% confidence interval of the paired Δ includes 0 (no change resolvable at this sample size); ✅ improvement / ⚠️ regression require the CI to exclude 0.
Allocation metrics (alloc bytes/render, Gen0 GC) are the sensitive signal for allocation-reduction work, where the mean-ms / memory figures are largely flat. They read n/a for a harness built from a revision that predates them (rebase the PR onto main to populate them).
Reconciler micro-benchmarks run PerfBench.ControlModel --variant Reactor (M1–M13) as a headless loop bracketed by per-thread alloc + GC counters — ns-resolution and free of WinUI render / working-set dilution, so they resolve Core/Reconciler allocation deltas the macro StocksGrid workload cannot. main and PR each link their own src/Reactor build and are rep-interleaved (a fresh alternated process per rep); Δ is the paired 95% CI over per-rep means. The Status column tracks allocated bytes/op (deterministic for identical code); ns/op is informational — its paired CI is now unbiased but the flag stays dormant pending a real-CI identical-binary band calibration.
¹ vanilla WinUI3 = StressPerf.Direct (imperative; no virtual-DOM, so it has no reconcile/diff phase — those cells read n/a). Measured live on this runner.
² Rust = test_reactor_perf from microsoft/windows-rs — a port of this harness (same StocksGrid, same --percent/--duration CLI). Built from source and measured live on this runner.
Absolute numbers are runner-dependent — trust the Δ vs main, not the absolute values. Memory (working set) is the noisiest metric.
Runner: CPU: AMD EPYC 7763 64-Core Processor · 4 logical cores · 16 GB RAM · runner: GitHub Actions 1043025925.
Generated by .github/workflows/perf-compare.yml · PR 5274c5a vs main 0002f19 · 2026-06-27T16:46:48Z · run log.

@azchohfi azchohfi closed this Jun 27, 2026
@azchohfi azchohfi deleted the revert-657 branch June 27, 2026 16:50