perf: coalesce COM-interop reads + allocations in reconcile path#666
perf: coalesce COM-interop reads + allocations in reconcile path#666azchohfi wants to merge 9 commits into
Conversation
Review-pass fixes for PR #666 (issue #654), all behavior-preserving and within the assigned reconcile-path files: - Context #25 (H1): CurrentValueEquals now reproduces the old object.Equals semantics EXACTLY — reference types use the virtual object.Equals (so an IEquatable<T> type without an object.Equals override keeps reference-equality change detection instead of collapsing distinct instances and skipping a required context rerender), value types keep the no-box EqualityComparer<T> path but guard the unbox with `is T` so a same-slot context-type swap returns false (as before) rather than throwing InvalidCastException. - #26 ApplyResourceOverrides (CO2): removal is driven by `managed` vs new overrides, not an old-vs-new ref comparison; the shallow-skip fast path calls this with old===new yet still must reconcile away stale managed keys. - #20 self-triggered set (CO3): add an Unmounted tombstone so a cached rerender closure firing after teardown can't resurrect the node into _selfTriggeredNodes (which the dirty-path pass would never clear again). - #16/#17 unmount tag re-read (CO4): the V1/type unmount dispatch re-reads the attached tag AFTER the user .OnUnmount/RunCleanups callbacks (which may detach state via the public APIs), matching the pre-coalesce per-site reads; pre-callback branches still share the single top-of-method read. - #23 ClearThemeBindings (AS1): clear the companion effective-theme marker DP alongside the style marker so the pair stays consistent across pool/reuse. Tests (tests/Reactor.Tests/ReconcilerInteropPerfTests.cs): - #15 rerender-delegate caching: identity reuse vs source-identity recreate; invocation marks SelfTriggered and bubbles to the source; post-unmount tombstone refuses a late self-trigger (#20). - #25 (H1) reference-IEquatable keeps reference-equality; value-type type mismatch returns false without throwing. - Scanner output parity pinned to exact literal messages (#62-#66). GetOrCreateComponentRerender made internal (InternalsVisibleTo Reactor.Tests) as a test seam. Validation: dotnet build src/Reactor -c Release 0/0; tests 9687 passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR targets reconciler hot-path performance by reducing redundant WinRT/COM boundary crossings and avoiding per-element/per-frame allocations, while adding unit tests to pin behavior parity for these optimizations.
Changes:
- Coalesces repeated COM/RCW reads/casts in update/unmount paths and introduces scratch-buffer reuse (lists, thread-static buffers) to cut allocations.
- Adds self-triggered component tracking + cached rerender delegate to avoid per-frame closures and reduce dirty-path scanning work.
- Refactors the debug accessibility scanner walk to reuse computed children/sibling snapshots and remove LINQ-heavy per-finding allocations; adds parity tests.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/Reactor.Tests/ReconcilerInteropPerfTests.cs | New unit tests pinning behavior parity for cache-key building, context compare semantics, context push overload, scanner output, and rerender delegate caching. |
| src/Reactor/Core/Reconciler.Update.cs | Coalesces FrameworkElement casts and reuses a realized-container scratch list to reduce per-refresh allocations. |
| src/Reactor/Core/Reconciler.Mount.cs | Prefilters layout footgun inspection to grids and switches component mount to cached rerender delegate creation. |
| src/Reactor/Core/Reconciler.KeyedItemsBinding.cs | Reuses ClosureItemViewSource when items/builder identities are unchanged to skip allocation + attached-DP write. |
| src/Reactor/Core/Reconciler.cs | Adds multiple hot-path optimizations: thread-static cache-key builder, self-triggered tracking set, unmount scratch list reuse, modifier clearing via shared empty, resource override scan without ToHashSet/ToArray, stagger key reuse, and themed style guard. |
| src/Reactor/Core/ContextScope.cs | Adds single-pair Push overload to avoid one-entry dictionary allocations. |
| src/Reactor/Core/Context.cs | Introduces typed context current-vs-last comparison helper to avoid boxing in memo/context-change detection. |
| src/Reactor/Core/AccessibilityScanner.cs | Refactors tree walk to compute children once, caches sibling-text arrays per frame, replaces LINQ with pooled/scratch buffers, and uses pooled sorting for tab-index gap checks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Addresses a Copilot review comment on #666: a cached rerender closure (perf #15) handed to a long-lived or cross-thread subscriber can outlive unmount and keep its ComponentNode alive. Drop ComponentNode.Control in ClearSelfTriggered(unmounting:true) so a leaked closure can't also pin the unmounted visual subtree. The dirty-path pass (perf #20) is the only reader of Control and never visits an unmounted node, so this is observably a no-op — it only releases the subtree for GC, reinforcing the #20 tombstone. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/perf |
⚡ Reactor perf comparisonWorkload: Regression vs
|
| Metric | main (baseline) |
This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Renders/sec ↑ | 2.55 | 2.52 | +2.7% 95% CI [-4.2, +9.6] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 135.9 | 135.2 | -2.7% 95% CI [-7.1, +1.7] | ≈ within noise |
| Avg Diff (ms) ↓ | 122.0 | 121.7 | -2.5% 95% CI [-7.2, +2.2] | ≈ within noise |
| Avg Memory (MB) ↓ | 286.1 | 285.8 | 0.0% 95% CI [-1.2, +1.2] | ≈ within noise |
Low-mutation skip-floor (--percent 0)
At --percent 0 the workload mutates few cells per tick (always at least one), so reconcile/diff isolate the O(n) per-tick child skip-walk floor that higher mutation rates dilute — ChildReconciler re-walks every child each tick even when nothing moved. The closer --percent is to 0, the more this floor is the signal, so a structural-skip optimization shows up cleanly where the headline table above buries it. Δ is the mean paired change with a 95% CI.
| Metric | main (baseline) |
This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Renders/sec ↑ | 16.45 | 16.30 | -0.6% 95% CI [-7.0, +5.9] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 35.4 | 36.2 | +3.9% 95% CI [-3.7, +11.5] | ≈ within noise |
| Avg Diff (ms) ↓ | 33.4 | 34.1 | +3.9% 95% CI [-4.3, +12.1] | ≈ within noise |
| Avg Memory (MB) ↓ | 268.4 | 267.9 | 0.0% 95% CI [-0.5, +0.5] | ≈ within noise |
Allocation (Reactor) — lower is better
| Metric | main (baseline) |
This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Alloc bytes/render ↓ | 5795645 | 5772253 | -0.8% 95% CI [-2.4, +0.7] | ≈ within noise |
| Gen0 GC / 1k renders ↓ | 241.67 | 230.77 | -4.5% 95% CI [-17.4, +8.4] | ≈ within noise |
Keyed-list workload (StressPerf.KeyedList, --percent 50)
A separate macro workload: a ~500-row stably keyed list whose rows are reordered / inserted / removed each tick. Because every child carries a key, the child reconciler takes its keyed arm (ReconcileKeyed → ReconcileKeyedMiddle, the LIS-based minimal-move pass) instead of the positional re-walk the StocksGrid tables above measure — so this is the sensitive macro signal for keyed-diff work the positional cells can never reach. Same interleaved paired-Δ 95% CI as the headline table.
| Metric | main (baseline) |
This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Renders/sec ↑ | 20.67 | 20.55 | -0.1% 95% CI [-1.6, +1.4] | ≈ within noise |
| Avg Reconcile (ms) ↓ | 15.9 | 16.0 | 0.0% 95% CI [-1.6, +1.5] | ≈ within noise |
| Avg Diff (ms) ↓ | 15.8 | 15.7 | -0.2% 95% CI [-1.7, +1.3] | ≈ within noise |
| Avg Memory (MB) ↓ | 168.5 | 169.0 | +0.1% 95% CI [-0.4, +0.6] | ≈ within noise |
Allocation (keyed-list) — lower is better
| Metric | main (baseline) |
This PR | Δ (95% CI) | Status |
|---|---|---|---|---|
| Alloc bytes/render ↓ | 313716 | 307439 | -1.9% 95% CI [-2.3, -1.6] | ✅ improvement |
| Gen0 GC / 1k renders ↓ | 18.61 | 14.49 | -10.0% 95% CI [-20.6, +0.7] | ≈ within noise |
Reconciler micro-benchmarks (PerfBench.ControlModel)
Production --variant Reactor control-model path, ns-resolution and WinUI-undiluted (spec-047 M1–M13) — ↓ lower is better. Status tracks allocated bytes/op, the authoritative signal here; it is deterministic for structurally-fixed benches, while dispatcher / background-thread benches carry a small process-to-process offset, so a bench is flagged only when its 95% CI clears a ±3% minimum-effect band (real structural alloc changes are several percent to many-x). ns/op is shown for context but is not auto-flagged (its paired CI is rep-interleaved but the flag remains dormant pending a real-CI identical-binary band calibration). Δ is the mean paired change with a 95% CI.
| Bench | main ns/op |
Δ ns (95% CI) | main B/op |
Δ alloc (95% CI) | Status |
|---|---|---|---|---|---|
M1 Mount_Leaf_NoCallback |
149377.3 | +0.1% 95% CI [-1.1, +1.3] | 1140.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M2 Mount_Leaf_OneCallback |
110489.9 | +0.6% 95% CI [-6.4, +7.6] | 3383.3 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M3 Mount_Leaf_ThreeCallbacks |
220396.5 | -0.6% 95% CI [-5.6, +4.4] | 8438.9 | -0.6% 95% CI [-3.2, +2.0] | ≈ within noise |
M4 Dispatch_Switch_Cold |
105685.8 | +1.8% 95% CI [-2.8, +6.5] | 1767.8 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M5 Dispatch_Switch_Warm |
107225.2 | +2.1% 95% CI [-3.6, +7.8] | 1766.0 | 0.0% 95% CI [-1.7, +1.8] | ≈ within noise |
M6 Dispatch_ExternalType |
91270.2 | +0.2% 95% CI [-0.4, +0.8] | 987.6 | +0.1% 95% CI [-3.0, +3.2] | ≈ within noise |
M7 Update_NoChange |
55354.1 | +1.0% 95% CI [+0.1, +1.8] | 452.1 | +2.2% 95% CI [-4.6, +9.0] | ≈ within noise |
M8 Update_OneLeafChanged |
41823.2 | +1.4% 95% CI [+0.3, +2.5] | 536.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M9 Update_AllChanged |
2856759.2 | +1.3% 95% CI [-0.7, +3.2] | 184278.1 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M10 EventHandlerState_Alloc |
85021.8 | +2.0% 95% CI [-0.8, +4.8] | 3095.2 | 0.0% 95% CI [-0.1, 0.0] | ≈ within noise |
M11 ModifierEHS_Frequency |
46527.3 | +3.4% 95% CI [-1.0, +7.8] | 638.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M12 Pool_Rent_HotPath |
117414.9 | +1.6% 95% CI [-1.5, +4.8] | 1099.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M13 Setters_Suppression_Scope |
111.2 | -4.5% 95% CI [-14.4, +5.4] | 26.7 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
M14 Dsl_Rebuild_Cascade |
1565757.7 | -15.8% 95% CI [-19.3, -12.4] | 2231828.9 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
C207 ChangeHandler_DpRead_Coalesce |
1295.6 | -0.3% 95% CI [-4.6, +4.1] | 0.6 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
OAlloc Optional_Element_Alloc |
211.1 | +9.4% 95% CI [-4.6, +23.3] | 528.0 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
OUpdate Optional_Reconciler_Update |
12396.8 | +0.3% 95% CI [-1.8, +2.4] | 2772.3 | 0.0% 95% CI [0.0, 0.0] | ≈ within noise |
Cross-framework reference (same StocksGrid workload)
| Metric | vanilla WinUI3¹ | Rust windows-reactor² |
Reactor (this PR) |
|---|---|---|---|
| Renders/sec ↑ | 3.33 | 4.75 | 2.52 |
| Avg Reconcile (ms) ↓ | n/a | 19.7 | 135.2 |
| Avg Diff (ms) ↓ | n/a | 17.8 | 121.7 |
| Avg Memory (MB) ↓ | 263.9 | 196.4 | 285.8 |
↑ higher is better · ↓ lower is better. Within noise = the 95% confidence interval of the paired Δ includes 0 (no change resolvable at this sample size); ✅ improvement /
Allocation metrics (alloc bytes/render, Gen0 GC) are the sensitive signal for allocation-reduction work, where the mean-ms / memory figures are largely flat. They read n/a for a harness built from a revision that predates them (rebase the PR onto main to populate them).
Reconciler micro-benchmarks run PerfBench.ControlModel --variant Reactor (M1–M13) as a headless loop bracketed by per-thread alloc + GC counters — ns-resolution and free of WinUI render / working-set dilution, so they resolve Core/Reconciler allocation deltas the macro StocksGrid workload cannot. main and PR each link their own src/Reactor build and are rep-interleaved (a fresh alternated process per rep); Δ is the paired 95% CI over per-rep means. The Status column tracks allocated bytes/op (deterministic for identical code); ns/op is informational — its paired CI is now unbiased but the flag stays dormant pending a real-CI identical-binary band calibration.
¹ vanilla WinUI3 = StressPerf.Direct (imperative; no virtual-DOM, so it has no reconcile/diff phase — those cells read n/a). Measured live on this runner.
² Rust = test_reactor_perf from microsoft/windows-rs — a port of this harness (same StocksGrid, same --percent/--duration CLI). Built from source and measured live on this runner.
Absolute numbers are runner-dependent — trust the Δ vs main, not the absolute values. Memory (working set) is the noisiest metric.
Runner: CPU: AMD EPYC 7763 64-Core Processor · 4 logical cores · 16 GB RAM · runner: GitHub Actions 1043006547.
Generated by .github/workflows/perf-compare.yml · PR 5cc6cbc vs main e8572a0 · 2026-06-27T07:29:42Z · run log.
Reduces redundant WinRT/COM-RCW boundary crossings and per-element allocations on the diff/reconcile/patch hot path (data-grid stress workload, StocksGrid). Targets the dominant Reconcile gap. Reconciler.cs / Update.cs / Mount.cs: - #16 UnmountRecursive: read ReactorAttached.State once (was 5x GetValue). - #17 UnmountAndCollect: read element tag once (was 5x GetElementTag). - #14 Update: coalesce repeated `is FrameworkElement` RCW type-tests to one. - #15 Cache the component rerender delegate on ComponentNode; recreate only when the requestRerender identity changes. - #18 UpdateDefaultAutomationName: ref/ordinal short-circuit before the AutomationProperties.GetName COM read. - #19 ApplyModifiers accepts null = clear-all (no fresh ElementModifiers). - #20 PopulateDirtyAncestorPath iterates a tracked self-triggered set instead of the whole _componentNodes dict. - #21 UnmountAndPool reuses a borrowed scratch list (reentrant calls fall back to a fresh list). - #22 RefreshRealizedContainers reuses a cleared field list. - #23 ApplyStyleToElement ref-equal + ActualTheme guard; ApplyThemeBindings gated on the ShallowEquals fast path. - #24 BuildCacheKey: thread-static StringBuilder + pair buffer, single pass. - #25 HasConsumedContextChanged compares via Context<T>/EqualityComparer<T> (no boxing of value-type context). - #26 ApplyResourceOverrides: no ToHashSet/ToArray, ref-equal fast path. - #27 WireReferenceListEdge: static loop, no per-edge List/closures. - #28 ApplyStaggerDelays: hoisted static animation-key array. - #29 PushContextDisposable: single-pair push overload, pooled PopOnDispose. Reconciler.KeyedItemsBinding.cs: - #39 ref-equality cache; only realloc + SetItemViewSource on change. Reconciler.Mount.cs: - #40 LayoutFootgunDetector gated by GridElement pre-filter. AccessibilityScanner.cs (debug-only scan path): - #62 compute GetChildren once in Walk, pass into Push. - #63 cache sibling-text array in ScanContext, invalidate on Push. - #64 BuildContext single-pass with pooled scratch (no LINQ chains). - #65 IsLikelyEmoji explicit span loop (no LINQ). - #66 CheckTabIndexGaps sorts a rented buffer (no OrderBy/ToList). Behavior preserved exactly: coalesced COM reads observe the same values at the same times; unmount/cleanup ordering unchanged; scanner emits identical findings. AOT/trim clean (no new reflection). Adds tests/Reactor.Tests/ReconcilerInteropPerfTests.cs (11 tests: rerender-delegate caching, context-change compare, cache-key reuse, single-pair context push, scanner output parity). Closes #654 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review-pass fixes for PR #666 (issue #654), all behavior-preserving and within the assigned reconcile-path files: - Context #25 (H1): CurrentValueEquals now reproduces the old object.Equals semantics EXACTLY — reference types use the virtual object.Equals (so an IEquatable<T> type without an object.Equals override keeps reference-equality change detection instead of collapsing distinct instances and skipping a required context rerender), value types keep the no-box EqualityComparer<T> path but guard the unbox with `is T` so a same-slot context-type swap returns false (as before) rather than throwing InvalidCastException. - #26 ApplyResourceOverrides (CO2): removal is driven by `managed` vs new overrides, not an old-vs-new ref comparison; the shallow-skip fast path calls this with old===new yet still must reconcile away stale managed keys. - #20 self-triggered set (CO3): add an Unmounted tombstone so a cached rerender closure firing after teardown can't resurrect the node into _selfTriggeredNodes (which the dirty-path pass would never clear again). - #16/#17 unmount tag re-read (CO4): the V1/type unmount dispatch re-reads the attached tag AFTER the user .OnUnmount/RunCleanups callbacks (which may detach state via the public APIs), matching the pre-coalesce per-site reads; pre-callback branches still share the single top-of-method read. - #23 ClearThemeBindings (AS1): clear the companion effective-theme marker DP alongside the style marker so the pair stays consistent across pool/reuse. Tests (tests/Reactor.Tests/ReconcilerInteropPerfTests.cs): - #15 rerender-delegate caching: identity reuse vs source-identity recreate; invocation marks SelfTriggered and bubbles to the source; post-unmount tombstone refuses a late self-trigger (#20). - #25 (H1) reference-IEquatable keeps reference-equality; value-type type mismatch returns false without throwing. - Scanner output parity pinned to exact literal messages (#62-#66). GetOrCreateComponentRerender made internal (InternalsVisibleTo Reactor.Tests) as a test seam. Validation: dotnet build src/Reactor -c Release 0/0; tests 9687 passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Guards the boxed-Nullable<T> edge raised in Copilot review: a non-null int? boxes to a boxed underlying int, and the value-type `is T` guard (T = int?) still matches it, so an unchanged nullable context value is not spuriously treated as changed. Proves the H1 hybrid CurrentValueEquals preserves the old object.Equals nullable semantics against production code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds Assert.NotNull(wrapped) before invoking the cached rerender delegate in the two CachedRerender_* tests. Asserts the GetOrCreateComponentRerender contract (never returns null) and resolves a CodeQL "useless assignment" false positive (its dataflow did not model the subsequent delegate invocation as a read). Behavior-neutral test hardening. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review-pass fixes from the pr-review skill (correctness + alternative-solution + test-coverage dimensions) and the multi-model cross-check, all within the existing reconcile interop scope. No behavior change: - Context<T>.CurrentValueEquals (CO-H1, high): the value-type arm routed through EqualityComparer<T>.Default, which dispatches to IEquatable<T> and diverges from the original object.Equals for structs whose IEquatable<T>.Equals disagrees with their object.Equals override. Unify both arms to current.Equals(lastBoxed) — for two non-null operands that is exactly object.Equals — and keep the reference-type ReferenceEquals short-circuit so same-instance / non-reflexive cases stay exact. Still avoids boxing the freshly-read value for primitives/Nullable<T>. Drops the now-unused System.Collections.Generic using. - BuildCacheKey (AS-L1): release the thread-static scratch pair array when an outlier element grows it past a small cap, so it is not pinned for the thread's lifetime. - RefreshRealizedContainers (AS-L2): only retain the realized-scratch list when its capacity stays modest; an oversized one-off refresh is collected, not pinned. - Tests: assert scanner emission ORDER (the parity test sorts first), plus value-type IEquatable-vs-object.Equals divergence and the reference-equality fast-path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses a Copilot review comment on #666: a cached rerender closure (perf #15) handed to a long-lived or cross-thread subscriber can outlive unmount and keep its ComponentNode alive. Drop ComponentNode.Control in ClearSelfTriggered(unmounting:true) so a leaked closure can't also pin the unmounted visual subtree. The dirty-path pass (perf #20) is the only reader of Control and never visits an unmounted node, so this is observably a no-op — it only releases the subtree for GC, reinforcing the #20 tombstone. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eview) Addresses three Copilot re-review comments on the interop-perf PR, all the same scratch-hygiene pattern already applied to t_cacheKeyPairs (#24) and _realizedScratch (#22): a one-off outlier input can leave a thread-static / instance scratch buffer holding an oversized backing array (or stale element refs) for the rest of the thread/reconciler lifetime. Behavior-neutral — the computed outputs are unchanged, so existing parity tests cover them. - AccessibilityScanner.BuildContext: wrap the t_buildCtxChildren populate/use in try/finally so a mid-population exception can't leave element refs pinned on the thread-static list until the next call. Success-path timing of the clear is unchanged. - BuildCacheKey: mirror the pairs-array retain-cap for the thread-static StringBuilder (t_cacheKeySb) — release it after a one-off unusually large key instead of pinning the grown backing buffer. - UnmountAndPool: mirror the _realizedScratch capacity cap for _unmountPoolScratch — only retain the scratch list when it's still modestly sized, so a one-off large-subtree unmount doesn't pin an oversized array. Validation: dotnet build src/Reactor -c Release -> 0 warnings / 0 errors; dotnet test tests/Reactor.Tests -> 9691 passed / 0 failed / 64 skipped. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…heck) A multi-model (gpt-5.4) cross-check of the four reconcile/scan scratch retain-caps flagged the BuildCacheKey site: at the degenerate combined-cap case (an element with >64 theme bindings AND a >1024-char key rendered every frame), discarding the thread-static buffers reallocates a KeyValuePair[] (2x the size of the pre-#24 baseline's string[]) plus a fresh StringBuilder — i.e. WORSE than origin/main, which the perf PR must never be. The size-cap's only job was bounding memory retention, but Array.Clear(pairs) / sb.Clear() already drop the referenced strings/ThemeRefs, so nothing is pinned regardless. Removing the caps makes BuildCacheKey unconditional- retain, which is strictly cheaper than baseline in every case (the dominant StringBuilder backing buffer is reused, not realloc'd) and provably never worse. The other two caps are unchanged — the cross-check CONFIRMED them: _realizedScratch and _unmountPoolScratch discard to EXACTLY the origin/main allocation shape (List<UIElement> / List<FrameworkElement>), so they cannot be worse than baseline, and both sit well above the 50-row StocksGrid steady-state high-water (~64 <= 256) so they never trim-and-regrow on the stress workload. Behavior-neutral: BuildCacheKey return values are unchanged (only buffer-retention differs), so existing parity tests cover it. Validation: dotnet build src/Reactor -c Release -> 0 warnings / 0 errors; dotnet test tests/Reactor.Tests -> 9691 passed / 0 failed / 64 skipped. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/perf |
405f758 to
b33b5cd
Compare
|
/perf |
|
/perf |
|
Closing after re-validation on current main (post-#692 + #665). Confirmed an honest no-op: the COM-interop coalescing this targets is below measurement resolution on the available workloads, and — notably — the earlier skip-floor |
|
/perf |
1 similar comment
|
/perf |
Closes #654
What
Reduces redundant WinRT/COM-RCW boundary crossings and per-element allocations on the diff → reconcile → patch hot path (data-grid stress workload,
StressPerf.ReactorOptimizedStocksGrid). This directly targets the dominant Reconcile gap (~46 ms C# vs ~7.9 ms Rust).Each WinRT control access (GetValue on attached DPs, RCW type-tests, AutomationProperties reads) crosses a COM boundary; the reconciler previously repeated these several times per element. These changes read each value once and reuse it, and remove per-element allocations.
Changes
Reconciler.cs/Reconciler.Update.cs/Reconciler.Mount.csUnmountRecursive: readReactorAttached.Stateonce (was 5×GetValue).UnmountAndCollect: read the element tag once (was 5×GetElementTag).is FrameworkElementRCW type-tests into onetarget as FrameworkElement.ComponentNode; recreate only when therequestRerenderidentity changes (was a fresh closure perReconcileComponent).UpdateDefaultAutomationName: ref/ordinal short-circuit before theAutomationProperties.GetNameCOM read; skip for element types that never caption.ApplyModifiersacceptsnull= clear-all (no freshElementModifiersper clear update).PopulateDirtyAncestorPathiterates a tracked self-triggered set instead of the whole_componentNodesdict.UnmountAndPoolreuses a borrowed scratch list (reentrant calls fall back to a fresh list).RefreshRealizedContainersreuses a cleared field list (wasnew List<UIElement>(count)per call).ApplyStyleToElementref-equal +ActualThemeguard;ApplyThemeBindingsgated on theShallowEqualsfast path. The null-then-set dance is preserved when the effective theme actually changes (load-bearing for{ThemeResource}re-resolution).BuildCacheKey: thread-staticStringBuilder+ pair buffer, single pass (wasKeys.ToArray()+ newStringBuilder+ double dict lookup).HasConsumedContextChangedcompares viaContext<T>/EqualityComparer<T>.Default(no boxing of value-type context).ApplyResourceOverrides: noToHashSet/ToArray, ref-equal fast path, linear scan of small sets.WireReferenceListEdge: static loop with a local ref (no per-edgeList+Anyclosures).ApplyStaggerDelays: hoisted static animation-key array (wasnew string[6]per staggered child).PushContextDisposable: single-pair push overload, pooledPopOnDispose.Reconciler.KeyedItemsBinding.csClosureItemViewSource+SetItemViewSource(COM) on change.Reconciler.Mount.csLayoutFootgunDetector.Inspectgated behind aGridElementpre-filter (no-op for non-grids; flag short-circuit preserved in Release).AccessibilityScanner.cs(debug-only / opt-in scan path)GetChildrenonce inWalk, pass intoPush(was two full switch dispatches per element).ScanContext, invalidate onPush(wasToArray()at 5 emit sites).BuildContextsingle-pass with pooled scratch (was 3 LINQ chains per finding).IsLikelyEmojiexplicit span loop (wastext.Any/AllLINQ per Button label).CheckTabIndexGapssorts a rented buffer (wasOrderBy+ToList).Behavior preservation
Behavior is preserved exactly: coalesced COM reads observe the same values at the same times, unmount/cleanup ordering is unchanged, and the accessibility scanner emits identical findings. No new reflection — the core lib stays AOT/trim-clean (warnings-as-errors).
Tests
Adds
tests/Reactor.Tests/ReconcilerInteropPerfTests.cs(11 tests): rerender-delegate caching/identity, context-change comparison (no boxing), cache-key reuse and format parity, single-pair context push, and scanner output parity (#62–#66).Validation
dotnet build src/Reactor/Reactor.csproj -c Release→ clean (0 warnings / 0 errors).dotnet test tests/Reactor.Tests→ full suite green (9681 passed, pre-existing skips only); the 11 new tests pass.LT_OnMountUnmountBalancedis a pre-existing async-WaitForflake — it fails more often on unmodifiedmainthan with this change.)Perf impact validates via
/perfonce that workflow merges.