DO-NOT-MERGE: #657 revert for keyed-regression confirm by azchohfi · Pull Request #734 · microsoft/microsoft-ui-reactor

azchohfi · 2026-06-27T16:06:48Z

Measurement-only: reverts #657 (628fb7f) on top of current main to confirm the keyed-list reconcile/diff regression attributed to #657 as an inverse signal: if #657 caused it, this revert should measure as an improvement. Do not merge; this PR will be torn down.

…edListDiff (#657)" This reverts commit 628fb7f.

azchohfi · 2026-06-27T16:06:49Z

/perf

github-actions · 2026-06-27T16:46:51Z

⚡ Reactor perf comparison

Workload: StressPerf.ReactorOptimized StocksGrid · --percent 50 --duration 10 · x64 Release · median of 12 paired runs (2 warmup dropped); Δ is the mean change with a 95% CI · PR head and main built and run interleaved on the same runner.
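The report does not say how the workflow derives "the mean change with a 95% CI" from the paired runs. One conventional choice is a paired Student-t interval over the per-pair percent changes; the sketch below assumes that method. The function name, the critical-value table and the sample numbers are all hypothetical and are not taken from perf-compare.yml.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
# Hard-coded so the sketch stays stdlib-only; df = 11 matches 12 paired runs.
T_CRIT_95 = {9: 2.262, 10: 2.228, 11: 2.201}


def paired_delta_ci(baseline, candidate):
    """Return (mean, low, high) of the per-pair percent change.

    Assumption: each baseline[i] / candidate[i] pair came from one
    interleaved run on the same runner, so the pairs can be differenced.
    """
    if len(baseline) != len(candidate):
        raise ValueError("runs must be paired one-to-one")
    deltas = [100.0 * (c - b) / b for b, c in zip(baseline, candidate)]
    n = len(deltas)
    mean = statistics.fmean(deltas)
    # Standard error of the mean of the paired differences.
    sem = statistics.stdev(deltas) / math.sqrt(n)
    half_width = T_CRIT_95[n - 1] * sem
    return mean, mean - half_width, mean + half_width


# Made-up reconcile times (ms) for 12 paired runs, for illustration only.
main_ms = [20.1, 20.8, 20.5, 20.9, 20.3, 20.7, 20.6, 20.4, 20.8, 20.2, 20.6, 20.5]
pr_ms = [17.2, 17.9, 17.4, 17.8, 17.3, 17.6, 17.7, 17.5, 17.6, 17.1, 17.5, 17.4]
mean, low, high = paired_delta_ci(main_ms, pr_ms)
print(f"{mean:+.1f}% (95% CI [{low:+.1f}, {high:+.1f}])")
```

Differencing within each pair removes runner drift that affects both builds equally, which is presumably why the runs are interleaved.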

Regression vs `main` baseline

| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 2.66 | 2.63 | -1.5% (95% CI [-7.3, +4.2]) | ≈ within noise |
| Avg Reconcile (ms) ↓ | 121.7 | 122.5 | +2.6% (95% CI [-1.9, +7.1]) | ≈ within noise |
| Avg Diff (ms) ↓ | 112.1 | 111.8 | +2.4% (95% CI [-2.3, +7.2]) | ≈ within noise |
| Avg Memory (MB) ↓ | 283.8 | 283.8 | -0.2% (95% CI [-1.2, +0.8]) | ≈ within noise |

Low-mutation skip-floor (`--percent 0`)

At --percent 0 the workload mutates few cells per tick (always at least one), so reconcile/diff isolate the O(n) per-tick child skip-walk floor that higher mutation rates dilute — ChildReconciler re-walks every child each tick even when nothing moved. The closer --percent is to 0, the more this floor is the signal, so a structural-skip optimization shows up cleanly where the headline table above buries it. Δ is the mean paired change with a 95% CI.
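To make the skip-walk floor concrete, here is a toy positional pass. It is not the real ChildReconciler; the function name and the equal-length-list assumption are illustrative only. It shows that the number of children visited stays at n whether one child changed or half of them did, while only the patch work scales with the mutation rate.

```python
def reconcile_positional(old_children, new_children):
    """Toy positional pass: compare children index by index.

    Returns (visited, patched_indices). Every child is visited even when it
    is unchanged, which is the O(n) floor; only differing children are
    recorded as needing a patch. Assumes both lists have the same length.
    """
    visited = 0
    patched = []
    for index, (old, new) in enumerate(zip(old_children, new_children)):
        visited += 1
        if old != new:
            patched.append(index)
    return visited, patched


old = list(range(1000))

# Near-zero mutation: exactly one child differs.
one_change = old.copy()
one_change[42] = -1

# Heavy mutation: every even-indexed child differs.
half_changed = [v + 10_000 if i % 2 == 0 else v for i, v in enumerate(old)]

low_visited, low_patched = reconcile_positional(old, one_change)
high_visited, high_patched = reconcile_positional(old, half_changed)
print(low_visited, len(low_patched))
print(high_visited, len(high_patched))
```

In this model the walk cost is identical in both cases, so at low mutation rates it accounts for nearly all of the measured time.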

| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 16.59 | 16.48 | -1.2% (95% CI [-8.5, +6.0]) | ≈ within noise |
| Avg Reconcile (ms) ↓ | 37.1 | 35.0 | -1.8% (95% CI [-7.7, +4.1]) | ≈ within noise |
| Avg Diff (ms) ↓ | 35.0 | 33.0 | -2.1% (95% CI [-8.2, +4.0]) | ≈ within noise |
| Avg Memory (MB) ↓ | 266.0 | 265.4 | -0.2% (95% CI [-0.5, +0.1]) | ≈ within noise |

Allocation (Reactor) — lower is better

| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Alloc bytes/render ↓ | 4848013 | 4884956 | +1.4% (95% CI [+0.2, +2.6]) | ⚠️ regression |
| Gen0 GC / 1k renders ↓ | 192.31 | 200.00 | +8.1% (95% CI [-3.9, +20.1]) | ≈ within noise |

Keyed-list workload (`StressPerf.KeyedList`, `--percent 50`)

A separate macro workload: a ~500-row stably keyed list whose rows are reordered / inserted / removed each tick. Because every child carries a key, the child reconciler takes its keyed arm (ReconcileKeyed → ReconcileKeyedMiddle, the LIS-based minimal-move pass) instead of the positional re-walk the StocksGrid tables above measure — so this is the sensitive macro signal for keyed-diff work the positional cells can never reach. Same interleaved paired-Δ 95% CI as the headline table.
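For readers unfamiliar with an LIS-based minimal-move pass, the general technique can be sketched as follows. This is not Reactor's ReconcileKeyedMiddle; the function names are hypothetical and keys are assumed unique. Surviving keys whose old positions form a longest increasing subsequence in the new order can stay where they are, and only the remaining survivors need to move.

```python
from bisect import bisect_left


def longest_increasing_subsequence(seq):
    """Return indices into seq of one longest strictly increasing subsequence."""
    tails = []        # tails[k]: index in seq ending the best run of length k + 1
    tail_values = []  # seq values at those indices, kept sorted for bisect
    prev = [-1] * len(seq)
    for i, value in enumerate(seq):
        k = bisect_left(tail_values, value)
        if k > 0:
            prev[i] = tails[k - 1]
        if k == len(tails):
            tails.append(i)
            tail_values.append(value)
        else:
            tails[k] = i
            tail_values[k] = value
    # Walk the predecessor links back from the end of the longest run.
    result = []
    i = tails[-1] if tails else -1
    while i != -1:
        result.append(i)
        i = prev[i]
    result.reverse()
    return result


def plan_keyed_moves(old_keys, new_keys):
    """Return (to_insert, to_move, to_remove) turning old_keys into new_keys."""
    old_index = {key: i for i, key in enumerate(old_keys)}
    new_set = set(new_keys)
    to_remove = [key for key in old_keys if key not in new_set]
    to_insert = [key for key in new_keys if key not in old_index]
    # Old positions of the surviving keys, listed in their new order.
    survivors = [key for key in new_keys if key in old_index]
    old_positions = [old_index[key] for key in survivors]
    stable = {survivors[i] for i in longest_increasing_subsequence(old_positions)}
    to_move = [key for key in survivors if key not in stable]
    return to_insert, to_move, to_remove


# Rotating the last row to the front needs one move, not four.
print(plan_keyed_moves(["a", "b", "c", "d", "e"], ["e", "a", "b", "c", "d"]))
```

The bisect-based LIS runs in O(n log n), so the planning step stays cheap relative to the moves it avoids.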

| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Renders/sec ↑ | 16.34 | 18.73 | +16.1% (95% CI [+12.3, +20.0]) | ✅ improvement |
| Avg Reconcile (ms) ↓ | 20.6 | 17.6 | -15.6% (95% CI [-17.9, -13.4]) | ✅ improvement |
| Avg Diff (ms) ↓ | 20.4 | 17.4 | -15.6% (95% CI [-17.8, -13.3]) | ✅ improvement |
| Avg Memory (MB) ↓ | 164.2 | 167.9 | +1.7% (95% CI [+1.0, +2.4]) | ⚠️ regression |

Allocation (keyed-list) — lower is better

| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |
| --- | --- | --- | --- | --- |
| Alloc bytes/render ↓ | 216279 | 314216 | +45.2% (95% CI [+44.4, +46.0]) | ⚠️ regression |
| Gen0 GC / 1k renders ↓ | 11.83 | 15.67 | +36.1% (95% CI [+27.7, +44.5]) | ⚠️ regression |

Reconciler micro-benchmarks (`PerfBench.ControlModel`)

Production --variant Reactor control-model path, ns-resolution and WinUI-undiluted (spec-047 M1–M13) — ↓ lower is better. Status tracks allocated bytes/op, the authoritative signal here; it is deterministic for structurally-fixed benches, while dispatcher / background-thread benches carry a small process-to-process offset, so a bench is flagged only when its 95% CI clears a ±3% minimum-effect band (real structural alloc changes are several percent to many-x). ns/op is shown for context but is not auto-flagged (its paired CI is rep-interleaved but the flag remains dormant pending a real-CI identical-binary band calibration). Δ is the mean paired change with a 95% CI.
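The flagging rule described above can be read as a small decision function. The sketch below is one interpretation: it assumes "clears the ±3% band" means the whole 95% CI lies beyond the band on one side. The function name and signature are hypothetical and are not taken from the workflow.

```python
def classify_alloc_delta(ci_low, ci_high, band=3.0):
    """Classify an allocated-bytes/op change, given its 95% CI in percent.

    Lower is better. A bench is flagged only when the entire interval sits
    outside the minimum-effect band (assumed reading of the rule).
    """
    if ci_low > band:
        return "regression"
    if ci_high < -band:
        return "improvement"
    return "within noise"


# M3's alloc CI from the table below stays inside the band.
print(classify_alloc_delta(0.2, 2.9))
# M7's CI is wide and straddles the band, so it is not flagged either.
print(classify_alloc_delta(-3.1, 19.8))
```

Under this reading, a CI that excludes 0 but stays inside the band (as for M3) is still reported as within noise, which matches the Status column.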

| Bench | `main` ns/op | Δ ns (95% CI) | `main` B/op | Δ alloc (95% CI) | Status |
| --- | --- | --- | --- | --- | --- |
| `M1` Mount_Leaf_NoCallback | 149999.2 | +0.6% (95% CI [-2.3, +3.6]) | 1140.9 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M2` Mount_Leaf_OneCallback | 109267.4 | -1.0% (95% CI [-5.7, +3.7]) | 3383.3 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M3` Mount_Leaf_ThreeCallbacks | 225433.3 | -1.8% (95% CI [-5.8, +2.1]) | 8395.4 | +1.6% (95% CI [+0.2, +2.9]) | ≈ within noise |
| `M4` Dispatch_Switch_Cold | 112542.6 | -3.6% (95% CI [-8.3, +1.0]) | 1767.8 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M5` Dispatch_Switch_Warm | 111827.4 | +1.1% (95% CI [-7.6, +9.7]) | 1805.9 | -1.4% (95% CI [-3.6, +0.8]) | ≈ within noise |
| `M6` Dispatch_ExternalType | 91199.2 | +0.6% (95% CI [-0.5, +1.6]) | 1028.6 | -2.4% (95% CI [-6.4, +1.5]) | ≈ within noise |
| `M7` Update_NoChange | 55403.2 | +0.2% (95% CI [-0.5, +0.8]) | 370.1 | +8.4% (95% CI [-3.1, +19.8]) | ≈ within noise |
| `M8` Update_OneLeafChanged | 42066.4 | +0.8% (95% CI [-2.0, +3.7]) | 536.0 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M9` Update_AllChanged | 2884322.0 | +0.1% (95% CI [-1.2, +1.4]) | 184278.1 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M10` EventHandlerState_Alloc | 86233.9 | -0.1% (95% CI [-2.6, +2.4]) | 3095.2 | 0.0% (95% CI [0.0, +0.1]) | ≈ within noise |
| `M11` ModifierEHS_Frequency | 45952.8 | +1.3% (95% CI [-0.5, +3.2]) | 638.9 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M12` Pool_Rent_HotPath | 117647.9 | +1.6% (95% CI [+0.1, +3.1]) | 1099.9 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M13` Setters_Suppression_Scope | 107.1 | +28.9% (95% CI [+5.0, +52.8]) | 26.7 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `M14` Dsl_Rebuild_Cascade | 1580037.0 | +0.7% (95% CI [-1.7, +3.1]) | 2231828.9 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `C207` ChangeHandler_DpRead_Coalesce | 1262.9 | +6.4% (95% CI [-9.9, +22.8]) | 0.6 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `OAlloc` Optional_Element_Alloc | 214.5 | +4.4% (95% CI [-2.6, +11.5]) | 528.0 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |
| `OUpdate` Optional_Reconciler_Update | 12558.9 | -0.8% (95% CI [-3.1, +1.5]) | 2772.3 | 0.0% (95% CI [0.0, 0.0]) | ≈ within noise |

Cross-framework reference (same StocksGrid workload)

| Metric | vanilla WinUI3¹ | Rust `windows-reactor`² | Reactor (this PR) |
| --- | --- | --- | --- |
| Renders/sec ↑ | 3.06 | 4.65 | 2.63 |
| Avg Reconcile (ms) ↓ | n/a | 19.7 | 122.5 |
| Avg Diff (ms) ↓ | n/a | 18.3 | 111.8 |
| Avg Memory (MB) ↓ | 263.3 | 197.8 | 283.8 |

↑ higher is better · ↓ lower is better. Within noise = the 95% confidence interval of the paired Δ includes 0 (no change resolvable at this sample size); ✅ improvement / ⚠️ regression require the CI to exclude 0.
Allocation metrics (alloc bytes/render, Gen0 GC) are the sensitive signal for allocation-reduction work, where the mean-ms / memory figures are largely flat. They read n/a for a harness built from a revision that predates them (rebase the PR onto main to populate them).
Reconciler micro-benchmarks run PerfBench.ControlModel --variant Reactor (M1–M13) as a headless loop bracketed by per-thread alloc + GC counters: ns-resolution and free of WinUI render / working-set dilution, so they resolve Core/Reconciler allocation deltas the macro StocksGrid workload cannot. main and PR each link their own src/Reactor build and are rep-interleaved (a fresh alternated process per rep); Δ is the paired 95% CI over per-rep means. The Status column tracks allocated bytes/op (deterministic for identical code); ns/op is informational: its paired CI is now unbiased but the flag stays dormant pending a real-CI identical-binary band calibration.
¹ vanilla WinUI3 = StressPerf.Direct (imperative; no virtual-DOM, so it has no reconcile/diff phase and those cells read n/a). Measured live on this runner.
² Rust = test_reactor_perf from microsoft/windows-rs, a port of this harness (same StocksGrid, same --percent/--duration CLI). Built from source and measured live on this runner.
Absolute numbers are runner-dependent: trust the Δ vs main, not the absolute values. Memory (working set) is the noisiest metric.
Runner: CPU: AMD EPYC 7763 64-Core Processor · 4 logical cores · 16 GB RAM · runner: GitHub Actions 1043025925.
Generated by .github/workflows/perf-compare.yml · PR 5274c5a vs main 0002f19 · 2026-06-27T16:46:48Z · run log.
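The Status legend for the macro tables (within noise when the CI includes 0, otherwise improvement or regression depending on the metric's direction) can be written out as a short function. This is a sketch of the rule as stated in the legend, not code from the workflow; the function name and parameters are hypothetical.

```python
def classify_macro_delta(ci_low, ci_high, higher_is_better):
    """Classify a paired Δ from its 95% CI bounds, given in percent.

    The change is resolvable only when the interval excludes 0; its sign is
    then read against the metric's direction (↑ or ↓ in the tables).
    """
    if ci_low <= 0.0 <= ci_high:
        return "within noise"
    went_up = ci_low > 0.0
    return "improvement" if went_up == higher_is_better else "regression"


# Rows taken from the keyed-list table in this report.
print(classify_macro_delta(12.3, 20.0, higher_is_better=True))     # Renders/sec
print(classify_macro_delta(-17.9, -13.4, higher_is_better=False))  # Avg Reconcile
print(classify_macro_delta(1.0, 2.4, higher_is_better=False))      # Avg Memory
```

Unlike the micro-benchmark column, this rule has no minimum-effect band, so a small but tightly measured change such as the +1.7% memory delta is still flagged.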

Commit 5274c5a: Revert "perf: eliminate per-diff allocations in ChildReconciler + KeyedListDiff (#657)"

This reverts commit 628fb7f.

azchohfi closed this Jun 27, 2026

azchohfi deleted the revert-657 branch June 27, 2026 16:50

azchohfi wants to merge 1 commit into main from revert-657
