perf(reconciler): skip redundant element-modifier self-merge (P1) #692
45f5916 to fce3d26
|
/perf |
Pull request overview
This PR is a Phase-2 reconciler update-path performance change aimed at reducing per-element allocations and work during Reconciler.Update, especially for large-grid workloads.
Changes:
- Avoids redundant `x.Merge(x)` allocations when resolved modifiers already reference the element's own `ElementModifiers`.
- Adds a new `modifiersEqual` computation and an `ApplyModifiers` skip fast-path when resolved modifiers are structurally equal (with an `OnUpdateAction` exception).
- Adds a new headless test suite to pin the self-merge allocation regression and the `OnUpdateAction` exception behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/Reactor/Core/Reconciler.Update.cs | Adds ReferenceEquals guard for modifier self-merge and introduces the ApplyModifiers fast-path via modifiersEqual / ShouldApplyModifiers. |
| tests/Reactor.Tests/ReconcilerModifierMergeTests.cs | New tests covering self-merge semantics, allocation regression “teeth”, and ShouldApplyModifiers behavior around OnUpdateAction. |
⚡ Reactor perf comparison
Workload: Regression vs
| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Renders/sec ↑ | 2.38 | 2.59 | +6.7% 95% CI [-0.9, +14.3] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 149.6 | 137.7 | -8.6% 95% CI [-12.5, -4.7] | ✅ improvement |
| Avg Diff (ms) ↓ | 137.0 | 125.3 | -9.6% 95% CI [-13.7, -5.6] | ✅ improvement |
| Avg Memory (MB) ↓ | 293.8 | 288.6 | -1.6% 95% CI [-2.7, -0.5] | ✅ improvement |
Low-mutation skip-floor (--percent 0)
At --percent 0 the workload mutates few cells per tick (always at least one), so reconcile/diff isolate the O(n) per-tick child skip-walk floor that higher mutation rates dilute — ChildReconciler re-walks every child each tick even when nothing moved. The closer --percent is to 0, the more this floor is the signal, so a structural-skip optimization shows up cleanly where the headline table above buries it. Δ is the mean paired change with a 95% CI.
| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Renders/sec ↑ | 15.99 | 16.02 | +3.5% 95% CI [-3.6, +10.6] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 36.2 | 37.0 | +2.2% 95% CI [-7.5, +11.9] | ≈ within noise |
| Avg Diff (ms) ↓ | 34.0 | 34.8 | +2.4% 95% CI [-7.8, +12.5] | ≈ within noise |
| Avg Memory (MB) ↓ | 266.8 | 267.7 | +0.1% 95% CI [-0.3, +0.6] | ≈ within noise |
Allocation (Reactor) — lower is better
| Metric | main (baseline) | This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Alloc bytes/render ↓ | 9603489 | 5739023 | -40.2% 95% CI [-41.0, -39.5] | ✅ improvement |
| Gen0 GC / 1k renders ↓ | 291.67 | 277.93 | -4.3% 95% CI [-12.8, +4.3] | ≈ within noise |
Reconciler micro-benchmarks (PerfBench.ControlModel)
Production --variant Reactor control-model path, ns-resolution and WinUI-undiluted (spec-047 M1–M13) — ↓ lower is better. Status tracks allocated bytes/op, the authoritative signal here; it is deterministic for structurally-fixed benches, while dispatcher / background-thread benches carry a small process-to-process offset, so a bench is flagged only when its 95% CI clears a ±3% minimum-effect band (real structural alloc changes are several percent to many-x). ns/op is shown for context but is not auto-flagged (its paired CI is rep-interleaved but the flag remains dormant pending a real-CI identical-binary band calibration). Δ is the mean paired change with a 95% CI.
| Bench | main ns/op | Δ ns (95% CI) | main B/op | Δ alloc (95% CI) | Status |
|---|---|---|---|---|---|
| M1 Mount_Leaf_NoCallback | 153923.4 | -1.2% 95% CI [-7.6, +5.3] | 1140.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M2 Mount_Leaf_OneCallback | 112027.3 | -2.7% 95% CI [-8.9, +3.5] | 3383.3 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M3 Mount_Leaf_ThreeCallbacks | 234735.3 | +2.0% 95% CI [-4.2, +8.1] | 8429.9 | -0.5% 95% CI [-2.8, +1.9] | ≈ within noise |
| M4 Dispatch_Switch_Cold | 107320.8 | +3.0% 95% CI [-3.1, +9.1] | 1767.8 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M5 Dispatch_Switch_Warm | 109536.2 | +1.9% 95% CI [-5.5, +9.3] | 1766.0 | 0.0% 95% CI [-1.2, +1.2] | ≈ within noise |
| M6 Dispatch_ExternalType | 91878.5 | +0.2% 95% CI [-3.5, +3.8] | 987.6 | +0.1% 95% CI [-2.1, +2.2] | ≈ within noise |
| M7 Update_NoChange | 56791.0 | -0.5% 95% CI [-5.3, +4.3] | 452.1 | +0.7% 95% CI [-7.1, +8.4] | ≈ within noise |
| M8 Update_OneLeafChanged | 42189.1 | -1.6% 95% CI [-4.6, +1.4] | 536.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M9 Update_AllChanged | 2924667.5 | +3.8% 95% CI [-4.9, +12.6] | 184278.1 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M10 EventHandlerState_Alloc | 88743.6 | -1.4% 95% CI [-3.5, +0.6] | 3095.2 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M11 ModifierEHS_Frequency | 46891.4 | -2.0% 95% CI [-8.9, +5.0] | 638.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M12 Pool_Rent_HotPath | 119522.6 | +0.1% 95% CI [-4.6, +4.9] | 1099.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| M13 Setters_Suppression_Scope | 150.0 | -15.6% 95% CI [-24.7, -6.4] | 26.7 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| C207 ChangeHandler_DpRead_Coalesce | 1390.3 | -0.2% 95% CI [-6.8, +6.4] | 0.6 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| OAlloc Optional_Element_Alloc | 215.7 | +2.8% 95% CI [-14.9, +20.6] | 528.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
| OUpdate Optional_Reconciler_Update | 17256.2 | -29.6% 95% CI [-31.4, -27.8] | 5810.8 | -53.9% 95% CI [-53.9, -53.9] | ✅ improvement |
Cross-framework reference (same StocksGrid workload)
| Metric | vanilla WinUI3¹ | Rust windows-reactor² | Reactor (this PR) |
|---|---|---|---|
| Renders/sec ↑ | 3.18 | 4.75 | 2.59 |
| Avg Reconcile (ms) ↓ | n/a | 19.4 | 137.7 |
| Avg Diff (ms) ↓ | n/a | 17.7 | 125.3 |
| Avg Memory (MB) ↓ | 264.2 | 196.7 | 288.6 |
↑ higher is better · ↓ lower is better. Within noise = the 95% confidence interval of the paired Δ includes 0 (no change resolvable at this sample size); ✅ improvement /
Allocation metrics (alloc bytes/render, Gen0 GC) are the sensitive signal for allocation-reduction work, where the mean-ms / memory figures are largely flat. They read n/a for a harness built from a revision that predates them (rebase the PR onto main to populate them).
Reconciler micro-benchmarks run PerfBench.ControlModel --variant Reactor (M1–M13) as a headless loop bracketed by per-thread alloc + GC counters — ns-resolution and free of WinUI render / working-set dilution, so they resolve Core/Reconciler allocation deltas the macro StocksGrid workload cannot. main and PR each link their own src/Reactor build and are rep-interleaved (a fresh alternated process per rep); Δ is the paired 95% CI over per-rep means. The Status column tracks allocated bytes/op (deterministic for identical code); ns/op is informational — its paired CI is now unbiased but the flag stays dormant pending a real-CI identical-binary band calibration.
¹ vanilla WinUI3 = StressPerf.Direct (imperative; no virtual-DOM, so it has no reconcile/diff phase — those cells read n/a). Measured live on this runner.
² Rust = test_reactor_perf from microsoft/windows-rs — a port of this harness (same StocksGrid, same --percent/--duration CLI). Built from source and measured live on this runner.
Absolute numbers are runner-dependent — trust the Δ vs main, not the absolute values. Memory (working set) is the noisiest metric.
Runner: CPU: AMD EPYC 7763 64-Core Processor · 4 logical cores · 16 GB RAM · runner: GitHub Actions 1042925502.
Generated by .github/workflows/perf-compare.yml · PR 157ff02 vs main 52baebb · 2026-06-26T19:26:31Z · run log.
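The "within noise" rule used in the tables above (the 95% CI of the paired Δ includes 0) can be sketched as follows. This is a minimal illustration, not the harness's actual code: the type and method names are hypothetical, the `Regression` status name is an assumption (the legend is cut off in the report), the critical value 1.96 is a normal approximation where a real harness with few reps may well use a Student-t quantile, and the ±3% minimum-effect band of the micro-benchmarks is not modeled.

```csharp
using System;
using System.Linq;

// Hypothetical status enum; only "within noise" and "improvement" appear in the report.
public enum PerfStatus { WithinNoise, Improvement, Regression }

public static class PairedDelta
{
    // Mean and 95% CI of the per-rep percentage change (PR vs baseline), paired by rep index.
    // Assumption: normal-approximation critical value 1.96.
    public static (double Mean, double Lo, double Hi) Ci95(double[] baseline, double[] pr)
    {
        if (baseline.Length != pr.Length || baseline.Length < 2)
            throw new ArgumentException("need at least two paired reps");

        double[] pct = baseline.Zip(pr, (b, p) => 100.0 * (p - b) / b).ToArray();
        double mean = pct.Average();
        // Sample variance of the paired differences (n - 1 in the denominator).
        double variance = pct.Sum(d => (d - mean) * (d - mean)) / (pct.Length - 1);
        double half = 1.96 * Math.Sqrt(variance / pct.Length);
        return (mean, mean - half, mean + half);
    }

    // A CI that straddles 0 is unresolved; otherwise the sign decides,
    // interpreted through the metric's direction (lower-is-better or higher-is-better).
    public static PerfStatus Classify(double lo, double hi, bool lowerIsBetter)
    {
        if (lo <= 0 && hi >= 0) return PerfStatus.WithinNoise;
        bool decreased = hi < 0;
        return decreased == lowerIsBetter ? PerfStatus.Improvement : PerfStatus.Regression;
    }
}
```

Applied to the headline table, `Classify(-0.9, 14.3, lowerIsBetter: false)` gives `WithinNoise` for Renders/sec and `Classify(-12.5, -4.7, lowerIsBetter: true)` gives `Improvement` for Avg Reconcile, matching the reported statuses.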
fce3d26 to 7fb4f2e
Reconciler.Update resolves an element's modifiers by accumulating any ModifiedElement wrapper layers, then merging the final inner element's own Modifiers. For the common case of an element that carries modifiers directly (no wrapper) -- every cell in a large keyed grid -- the accumulator is still referentially the element's own ElementModifiers, so that final merge was a self-merge x.Merge(x): it allocated a fresh, value-identical ElementModifiers plus its non-null Layout/Visual bucket sub-records (a Layout+Visual cell is parent + 2 buckets = 3 records, on each of the old and new sides => ~6) per changed cell, every render. On the StocksGrid workload (500 cells, ~50% mutation) that is ~3 KB/changed cell (~7 MB/render of pure garbage).

Guard the final merge with !ReferenceEquals(accumulator, element.Modifiers): only merge when a wrapper layer actually contributed a distinct instance. When nothing wrapped the element, keep its own Modifiers reference as-is. Semantically identical (Merge(x,x) is value-equal to x); purely removes the allocation. ApplyModifiers behavior is unchanged.

Scope note: this PR is P1 only. The originally-paired P2 (an ApplyModifiers fast-path that skips the post-update pass when modifiers compare equal) was dropped after review found Element.ModifiersEqual does not compare every field ApplyModifiers writes (RequestedTheme, Scale/Rotation/Translation/CenterPoint, inline-flow margin/padding/border, OnUnmountAction/OnUpdateAction hooks), so a structurally-equal compare can coexist with a changed transform/theme and the skip would leave the control stale. P2 will return as its own PR that first makes ModifiersEqual complete w.r.t. ApplyModifiers' guarded writes.

Tests (tests/Reactor.Tests/ReconcilerModifierMergeTests.cs):
- Merge(x,x) is value-equal to x but a distinct instance (the invariant the guard relies on).
- A real wrapper+inner merge still combines correctly (inner wins, base fills gaps) -- the guard only skips the self case, never a real merge.
- Revert->fail teeth: 50k Update calls on a direct-modifier leaf allocate ~0 B/call with the guard; reverting it allocates ~1.95 KB/call (cap 64 B).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
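The guard described in this commit message can be illustrated with a small self-contained sketch. The records below are hypothetical stand-ins, not Reactor's real `ElementModifiers` (which has more buckets and fields), and `ModifierResolution.Resolve` is an invented helper name; the only point is that `Merge(x, x)` is value-equal to `x` yet allocates, so a `ReferenceEquals` check can skip it safely.

```csharp
#nullable enable
using System;

// Hypothetical, simplified stand-ins for the real modifier records.
public sealed record LayoutBucket(double? Width, double? Height);
public sealed record VisualBucket(double? Opacity);

public sealed record ElementModifiers(LayoutBucket? Layout, VisualBucket? Visual)
{
    // "other wins, base fills gaps". Always allocates a new parent record,
    // plus a new bucket wherever both sides carry one.
    public ElementModifiers Merge(ElementModifiers other) => new(
        MergeLayout(Layout, other.Layout),
        MergeVisual(Visual, other.Visual));

    private static LayoutBucket? MergeLayout(LayoutBucket? a, LayoutBucket? b) =>
        a is null ? b : b is null ? a : new LayoutBucket(b.Width ?? a.Width, b.Height ?? a.Height);

    private static VisualBucket? MergeVisual(VisualBucket? a, VisualBucket? b) =>
        a is null ? b : b is null ? a : new VisualBucket(b.Opacity ?? a.Opacity);
}

public static class ModifierResolution
{
    // accumulator: whatever wrapper layers contributed so far. With no wrapper
    // it is the element's own instance, and merging it with itself is pure garbage.
    public static ElementModifiers Resolve(ElementModifiers accumulator, ElementModifiers own) =>
        ReferenceEquals(accumulator, own)
            ? own                       // self case: skip the merge, keep the reference
            : accumulator.Merge(own);   // real merge: a wrapper contributed a distinct instance
}
```

In this sketch, `Resolve(x, x)` returns `x` itself, while `Resolve(wrapper, x)` still produces a merged record in which `x`'s values win and the wrapper fills the gaps.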
7fb4f2e to ad4de25
|
/perf |
… preservation (P1 pr-review)

Folds two test-coverage findings from the pr-review of #692 (both multi-model-confirmed), strengthening the modifier-merge teeth without touching production code:
- Update_WrapperLayer_StillMergesInnerModifiers: proves the !ReferenceEquals guard skips ONLY the self-merge - when a ModifiedElement wrapper contributes a distinct modifier instance, Reconciler.Update's fall-through still performs the real merge. Differential-allocation teeth verified to bite: simulating an over-broad guard makes this fail while the existing self-merge tooth still passes (unique coverage).
- Merge_Preserves_Lifecycle_Callbacks: pins that ElementModifiers.Merge preserves OnMount/OnUnmount/OnUpdateAction (other wins, base fills gaps), guarding the same callback-drop class that bit a sibling change.

Full Reactor.Tests green (9707 passed / 64 skipped / 0 failed). Test-only; no src/Reactor change, so the P1 perf delta on ad4de25 is unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
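The allocation "teeth" mentioned here (roughly 0 B/call with the guard against a 64 B cap) rest on measuring managed bytes per call. A minimal sketch of such a measurement is shown below; `AllocBudget` and `BytesPerCall` are hypothetical names, and the actual tests may be structured differently. It relies only on `GC.GetAllocatedBytesForCurrentThread()`, which is available on .NET Core 3.0 and later.

```csharp
using System;

public static class AllocBudget
{
    // Average managed bytes allocated per invocation of `action` on the current thread.
    // Per-thread counters keep background work (e.g. tiered JIT) out of the figure.
    public static double BytesPerCall(Action action, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        action(); // warm-up so one-time JIT / static-init allocations are not charged

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++) action();
        long after = GC.GetAllocatedBytesForCurrentThread();

        return (after - before) / (double)iterations;
    }
}
```

A budget test in this style would assert `BytesPerCall(...)` stays under the cap on the guarded path; reverting the guard pushes the figure well past it, which is what makes the test fail.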
|
/perf Acceptance-demo run: end-to-end validation of the complete merged harness (alloc metric + ns micro-suite + low-mutation skip-floor + keyed-list leg) on real CI-built exes. #692 is the self-merge-guard alloc fix, so the StocksGrid allocation headline is the story; keyed/skip-floor legs are expected within-noise (this PR doesn't target those paths). Baseline = current main (41e41d7, production-identical — all four harness merges were 0-src). — PERFVAL harness session |
|
/perf Re-fire post-#700-merge (budget-fit |
|
/perf |
… P4) (#699)

* perf(reconciler): structural-skip untouched child ranges (positional, P4)

Make `ChildReconciler.ReconcilePositional` O(changed) instead of O(count) when a memoizing producer (`UseMemoCellsByIndex`) reuses untouched cells reference-equal. Targets the positional skip-walk FLOOR -- the per-render O(count) cost of visiting every cell to confirm it can be skipped -- which dominates low-mutation renders of large keyed grids (e.g. StocksGrid).

Mechanism (CWT side-channel hint, mirrors #681's _dirtyAncestorPath bridge):
• Producer: the `UseMemoCellsByIndex` reuse branch publishes a `ChildDiffHint` (ChangedIndices + ThemeSensitiveCount) keyed by reference on the fresh-per-render Element[]. No Element-record widening; AOT-safe (ConditionalWeakTable, no reflection). The theme count is carried forward incrementally so steady-state reuse stays O(changed); a one-time O(count) scan runs only on the first reuse after a full rebuild and as a defensive recompute.
• Consumer: `ReconcilePositional` engages a fast path that updates ONLY the hinted changed indices and skips the rest, iff ALL hold:
  1. old/new element counts match,
  2. the live child collection equals that count (no in-flight anim inflated it),
  3. no animation ambient,
  4. a hint is present for THIS array (a CWT hit also proves Filter returned the same reference -- no null/EmptyElement shifted the index space),
  5. no cell is theme-sensitive (`!AnyThemeSensitive`),
  6. the container is not on #681's dirty-ancestor path.

Correctness:
• Untouched indices are reference-equal BY CONSTRUCTION (the hook reuses prevChildren[i] for unchanged i and rebuilds only changedIndices). The changed and full-walk paths share a single `UpdateCommonChild` helper, so both honour identical skip / update / type-mismatch semantics.
• The theme gate is the load-bearing safety property: the ONLY work the full walk does for an untouched cell that a structural skip would drop is re-resolving `ApplyThemeBindings` / `ApplyResourceOverrides` ThemeRefs against the effective theme (which a parent RequestedTheme toggle can change WITHOUT touching the element tree). Gating on the whole-array `AnyThemeSensitive` flag is provably safe and sidesteps the subtle dirty-path reasoning that bit P2.

Tests:
• Headless (Reactor.Tests): producer hint correctness incl. incremental theme-count carry + caller-mutation snapshot (UseMemoCellsTests); hint registry + IsThemeSensitive (ChildDiffHintsTests); consumer differential vs full walk incl. the gate teeth `ThemeSensitive_Hint_Forces_Full_Walk` (revert the gate → fails), count-mismatch, defensive OOB, empty-changed (ChildReconcilerStructuralSkipTests).
• Live selftests (Reactor.AppTests.Host): LifecycleParity (OnUpdateAction fires for a changed index, never for untouched ref-equal -- == full walk); ThemeRangeParity (themed ref-equal range under a RequestedTheme toggle renders + re-themes, no cell dropped). Per the empirical note below, the authoritative gate teeth is the headless visited-index assertion, not a live color delta.

Empirical theme note: a LIVE color-delta teeth for the theme gate is impossible -- WinUI auto-re-resolves a `{ThemeResource}` Style setter on any effective-theme change even when Reactor structurally skips the cell (verified: gate reverted → cells skipped, ApplyThemeBindings not re-run, yet brushes still went Light→Dark). The one snapshot a skip truly leaves stale (`ApplyResourceOverrides`' concrete ThemeRef.Resolve into fe.Resources) does not reliably re-resolve in the reconcile harness. The headless `ThemeSensitive_Hint_Forces_Full_Walk` is therefore the gate's load-bearing teeth; the live fixture is the end-to-end parity companion.
Measurement: the win shows under a low-mutation skip-floor metric (PERFVAL's `--percent 0`); on the default 50%-mutation StocksGrid it is within noise. File-disjoint from the perf fleet (#692/#695 own Reconciler.Update.cs; this touches ChildReconciler.cs / ChildDiffHints.cs / UseMemoCells.cs + a parentControl thread-through in Reconciler.cs / V1HandlerAdapter.cs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* perf(reconciler): fold pr-review findings into PR-C structural skip

Address the internal pr-review skill + GitHub Copilot findings on the positional structural-skip fast path. No behavior change for the StocksGrid target workload (all new gates pass on it); each fold tightens correctness or documents an invariant.

- Hot-reload safety (C1): add `!ForceFullRenderActive` gate so a hot-reload force pass never structurally skips an untouched wrapper cell (the dirty path is empty during a pure force pass, so the dirty-path gate alone did not cover it). Falls back to the full walk, which honours ForceRenderThroughWrapper per cell.
- Array-identity guard (S1): the hint now carries a WeakReference to the exact previous-render array its ChangedIndices were diffed against; the fast path engages only when the reconciler's old array IS that array. A cheap, self-documenting sufficient condition for the per-index ref-equality invariant; any defensive copy upstream safely falls back to the full walk. Weak on purpose -- a strong ref would chain every historical array through the reference-keyed CWT and leak.
- Duplicate-index hardening (dedupe): snapshot + sort/compact the caller's changedIndices before the theme tally / builder / publish. A duplicated themed->plain index could otherwise under-count the incremental theme-sensitive tally and wrongly publish AnyThemeSensitive=false, and would rebuild + re-update the same cell N times.
- Dirty-path gate (T1): documented as conservative defense-in-depth. Proven by experiment that it is behaviorally redundant given the count/CWT/array-id gates (the full walk skips a ref-equal self-triggered cell identically via CanSkipUpdate), retained as cheap insurance; costs nothing on the target workload (cell panel is a descendant, not an ancestor, of the self-triggered grid component).
- ResourceOverrides conservatism (C3): documented why the ThemeRef-backed ResourceOverrides arm of IsThemeSensitive is intentionally conservative.

Tests: weak-ref round-trip + stale-old-array teeth (gate 8) + chained theme-count carry (T3) + duplicate-index theme-count/build-once (T4). Full Reactor.Tests green (9731); StructuralSkip selftests green; core lib Release AOT-clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* perf(reconciler): address Copilot review on PR-C structural skip

Fold the three GitHub Copilot review findings on the positional structural-skip fast path. All are hardening; no behavior change on the StocksGrid target.

- Null cells (findings 1+2): a cell builder may legitimately return null (ChildReconciler.Filter drops nulls downstream), but PR-C's theme tally now inspects prev/built cells via ChildDiffHints.IsThemeSensitive, which dereferenced element.ThemeBindings and would NRE on a null. Widen the predicate to accept Element? and treat null as non-theme-sensitive (a null has no bindings to re-resolve). Fixes all three call sites in UseMemoCellsByIndex (the O(count) CountThemeSensitive scan + both incremental tally reads) at the single chokepoint.
- DebugElementsSkipped diagnostic (finding 3): the fast path adjusted the skipped-element counter by `common - changed.Length`, but the loop defensively ignores out-of-range hint indices, so the raw hint length over-counts visited work and the diagnostic could skew (or, with enough out-of-range indices, go negative). Track indices ACTUALLY visited and base the adjustment on that, making the counter match the full walk exactly.
Tests: null-cell predicate guard + producer null-cell theme-scan teeth; the out-of-range consumer test now asserts the skipped-element count equals the full-walk total (4), which the old `common - changed.Length` undercounted. Full Reactor.Tests green (9733); core lib Release AOT-clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fold pr-review skill findings into PR-C structural skip

Ran the internal pr-review skill on the PR-C HEAD (7 dimensions + a gpt-5.4 multi-model cross-check, a different model family). Fold the actionable findings:

- H1 (test-coverage; multi-model CONFIRMED load-bearing): add a hot-reload gate teeth selftest. StructuralSkip_HotReloadWrapperReRender puts a WRAPPER cell (Component) inside a UseMemoCellsByIndex range whose body a simulated hot-reload edit changes, then drives a real force pass. The fast path's !ForceFullRenderActive gate must defer to the full walk (which honours ForceRenderThroughWrapper per cell) so the wrapper re-renders its edited body. Teeth verified: reverting the gate fails WrapperReRenders + OldBodyGone (the structural skip swallows the edit).
- H2 (test-coverage; partially-confirmed): add a headless differential test that mirrors the real producer's reference-equal reuse at untouched indices (not fresh copies) and asserts the fast-path output == full-walk output (identical skip accounting, no structural mutation, visited set a subset).
- M1 (security + correctness; multi-model: real but not a ship-blocker, no cheap complete defense) + M3 (docs + api): document the returned array's immutability / no-mutation contract, the changedIndices dedupe contract, and the theme-sensitive fallback in the UseMemoCellsByIndex XML doc; hand-sync the generated reference MD.
Dispositions recorded (no code change):

- H3 / gate 6 (!IsOnDirtyAncestorPath): multi-model DISPUTED the test-coverage finding and independently confirmed the gate is behaviorally redundant given the count/CWT/array-id gates (a ref-equal untouched cell is skipped identically by the full walk via Element.CanSkipUpdate before dirty-path logic is consulted). No behavioral teeth is constructible; kept as documented cheap defense-in-depth.
- M2: ThemeRangeParity already documents itself in-code as a smoke/parity check, not the gate teeth; the authoritative !AnyThemeSensitive teeth is the headless ChildReconcilerStructuralSkipTests.ThemeSensitive_Hint_Forces_Full_Walk.
- L1: the ResourceOverrides arm of IsThemeSensitive is intentionally conservative (already documented) per the verified theme crux.

Gates: core lib Release AOT 0W/0E; Reactor.Tests 9734 pass / 0 fail; StructuralSkip selftests 3 fixtures / 14 checks green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: pin structural-skip per-cell read elision as an allocation budget

The existing ChildReconcilerStructuralSkipTests assert the fast path's VISIT COUNT (which child indices are read) but nothing pins the resulting allocation cut, so the measured StocksGrid allocation win (#699) could be silently reverted with every behavioural test still green.

Add Structural_Skip_Pins_PerCell_Read_Elision_As_Allocation_Budget: a MeasuringChildCollection charges a fixed managed allocation per Get(i), modeling the per-cell COM read / marshaling the skip elides for untouched reference-equal cells (the real cost is native and unmeasurable headless). Fast path (hint published) allocates O(changed); full walk (no hint) allocates O(count). Asserts the mechanism (5 vs 500 reads/iter) and an 8x GC-bytes budget. Has teeth: disabling the fast-path gate makes the hinted path walk every cell, collapsing fast onto full and failing the test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* harden ChildDiffHint.AnyThemeSensitive to fail-safe (!= 0 not > 0)

Copilot review on #699 flagged that AnyThemeSensitive derives from ThemeSensitiveCount > 0, so a hypothetical negative count would read as NOT theme-sensitive and could allow the structural-skip fast path to skip a theme-sensitive subtree (a missed-update risk).

The only producer (UseMemoCellsByIndex) already clamps its incremental tally to a >= 0 floor before publishing (UseMemoCells.cs:299-300) and CountThemeSensitive only counts upward, so a negative is unreachable and > 0 is correct today. But this is the SAFETY gate for a correctness-sensitive skip, so harden the type to be fail-safe regardless: test != 0 rather than > 0. Behavior is byte-identical for every value the producer can emit (all >= 0); the only difference is that an anomalous negative now BLOCKS the skip (forces the always-correct full walk) instead of silently allowing it -- the correct fail direction for a correctness gate.

Provably perf-neutral: the StocksGrid workload publishes count == 0 every render, where both > 0 and != 0 yield false identically, so the fast path engages unchanged. Adds a fail-safe teeth test that goes red if the guard is reverted to > 0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: azchohfi <azchohfi@users.noreply.github.com>
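The side-channel hint described in the #699 commits above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `ChildDiffHints` code: the member names are inferred from the commit text, `object[]` stands in for `Element[]`, `TryGetFastPath` is an invented helper, and only a subset of the gates is modeled (count match, hint present for this exact array, fail-safe theme gate, weak array-identity guard). The animation, live-collection-count, dirty-path, and force-render gates are omitted.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;

// Hypothetical hint shape; property names follow the commit message.
public sealed class ChildDiffHint
{
    public ChildDiffHint(int[] changedIndices, int themeSensitiveCount, object[] previousArray)
    {
        ChangedIndices = changedIndices;
        ThemeSensitiveCount = themeSensitiveCount;
        // Weak on purpose: a strong reference would chain every historical array.
        PreviousArray = new WeakReference<object[]>(previousArray);
    }

    public int[] ChangedIndices { get; }
    public int ThemeSensitiveCount { get; }
    public WeakReference<object[]> PreviousArray { get; }

    // Fail-safe: any nonzero count, even an anomalous negative, blocks the skip.
    public bool AnyThemeSensitive => ThemeSensitiveCount != 0;
}

public static class ChildDiffHints
{
    // Reference-keyed side table: no widening of the element type, no reflection,
    // and entries disappear once the keyed array is collected.
    private static readonly ConditionalWeakTable<object[], ChildDiffHint> Table = new();

    public static void Publish(object[] newChildren, ChildDiffHint hint) =>
        Table.AddOrUpdate(newChildren, hint);

    // Returns the indices to visit, or null when the caller must do the full walk.
    public static int[]? TryGetFastPath(object[] oldChildren, object[] newChildren)
    {
        if (oldChildren.Length != newChildren.Length) return null;       // counts must match
        if (!Table.TryGetValue(newChildren, out var hint)) return null;  // hint for THIS array
        if (hint.AnyThemeSensitive) return null;                         // theme gate
        if (!hint.PreviousArray.TryGetTarget(out var previous)
            || !ReferenceEquals(previous, oldChildren)) return null;     // array-identity guard
        return hint.ChangedIndices;
    }
}
```

Because every failed gate returns `null`, the design degrades to the always-correct full walk; the fast path only narrows work when all the conditions are positively established.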
What
`Reconciler.Update` resolves an element's modifiers by accumulating any `ModifiedElement` wrapper layers, then merging the final inner element's own `Modifiers`. For the common case of an element that carries modifiers directly (no wrapper) -- every cell in a large keyed grid -- the accumulator is still referentially the element's own `ElementModifiers`, so that final merge was a self-merge `x.Merge(x)`: it allocated a fresh, value-identical `ElementModifiers` plus its `Layout`/`Visual`/`Text` sub-records (~6 records) per side, per changed cell, every render.

On the StocksGrid workload (500 cells, ~50% mutation) that is ~3 KB/changed cell (~7 MB/render of pure garbage) -- the single largest allocation lever found in the Phase-1 reconciler profile (hotspot H1).

Change
Guard the final merge with `!ReferenceEquals(accumulator, element.Modifiers)`: only merge when a wrapper layer actually contributed a distinct instance. When nothing wrapped the element, keep its own `Modifiers` reference as-is.
- `Merge(x, x)` is value-equal to `x`.
- `ApplyModifiers` behavior is unchanged (it runs exactly as on `main`).

Scope note (P1 only)
This PR is P1 only. The originally-paired P2 -- an `ApplyModifiers` fast-path that skipped the post-update pass when modifiers compared equal -- was dropped after review. `Element.ModifiersEqual` does not compare every field `ApplyModifiers` writes (`RequestedTheme`, `Scale`/`Rotation`/`Translation`/`CenterPoint`, inline-flow margin/padding/border, and the `OnUnmountAction`/`OnUpdateAction` side-effect hooks), so a structurally-equal compare can coexist with a changed transform/theme and the skip would leave the control stale. P2 will return as its own follow-up PR that first makes `ModifiersEqual` complete w.r.t. `ApplyModifiers`' guarded writes.

Tests
`tests/Reactor.Tests/ReconcilerModifierMergeTests.cs`:
- `Merge(x, x)` is value-equal to `x` but a distinct instance (what makes skipping the self-merge safe).
- `Update` calls on a direct-modifier leaf allocate ~0 B/call with the guard; reverting it allocates ~1.95 KB/call (cap 64 B).

Full `dotnet test tests/Reactor.Tests` green (9705 passed / 0 failed / 64 skipped). Core lib `dotnet build src/Reactor/Reactor.csproj -c Release` AOT-clean (0 warnings / 0 errors).
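The reason P2 was dropped can be shown with a toy model: if the equality check covers fewer fields than the apply pass writes, an "equal" verdict can hide a real change and the skip leaves the control stale. Everything below is hypothetical and heavily simplified (two fields, invented type names); it is not Reactor's `Element.ModifiersEqual` or `ApplyModifiers`.

```csharp
using System;

// Hypothetical two-field modifier set and target control.
public sealed record Modifiers(double Opacity, double Rotation);

public sealed class Control
{
    public double Opacity;
    public double Rotation;
}

public static class SkipHazard
{
    // Deliberately incomplete comparison: it forgets Rotation,
    // mirroring a compare that misses fields the apply pass writes.
    public static bool ModifiersEqual(Modifiers a, Modifiers b) => a.Opacity == b.Opacity;

    // The apply pass writes every field.
    public static void ApplyModifiers(Control control, Modifiers m)
    {
        control.Opacity = m.Opacity;
        control.Rotation = m.Rotation;
    }

    // The dropped fast path: skip the apply pass when the compare says "equal".
    public static void UpdateWithSkip(Control control, Modifiers oldM, Modifiers newM)
    {
        if (ModifiersEqual(oldM, newM)) return; // hazard: a Rotation-only change is skipped
        ApplyModifiers(control, newM);
    }
}
```

With `oldM = (1.0, 0)` and `newM = (1.0, 45)`, `UpdateWithSkip` returns early and the control keeps `Rotation == 0`. The fast path is only safe once the comparison covers every field the apply pass writes, which is the precondition the scope note sets for the follow-up PR.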