Skip to content

perf: coalesce COM-interop reads + allocations in reconcile path#666

Closed
azchohfi wants to merge 9 commits into
mainfrom
azchohfi-reconciler-interop-perf
Closed

perf: coalesce COM-interop reads + allocations in reconcile path#666
azchohfi wants to merge 9 commits into
mainfrom
azchohfi-reconciler-interop-perf

Conversation

@azchohfi

Copy link
Copy Markdown
Collaborator

Closes #654

What

Reduces redundant WinRT/COM-RCW boundary crossings and per-element allocations on the diff → reconcile → patch hot path (data-grid stress workload, StressPerf.ReactorOptimized StocksGrid). This directly targets the dominant Reconcile gap (~46 ms C# vs ~7.9 ms Rust).

Each WinRT control access (GetValue on attached DPs, RCW type-tests, AutomationProperties reads) crosses a COM boundary; the reconciler previously repeated these several times per element. These changes read each value once and reuse it, and remove per-element allocations.

Changes

Reconciler.cs / Reconciler.Update.cs / Reconciler.Mount.cs

Reconciler.KeyedItemsBinding.cs

Reconciler.Mount.cs

AccessibilityScanner.cs (debug-only / opt-in scan path)

#67 intentionally not taken — premise didn't hold: scanner findings are always emitted (no ignored path) and A11yDiagnostic.Message is a public required serialized property, so lazy computation saves nothing and would risk the record contract.

Behavior preservation

Behavior is preserved exactly: coalesced COM reads observe the same values at the same times, unmount/cleanup ordering is unchanged, and the accessibility scanner emits identical findings. No new reflection — the core lib stays AOT/trim-clean (warnings-as-errors).

Tests

Adds tests/Reactor.Tests/ReconcilerInteropPerfTests.cs (11 tests): rerender-delegate caching/identity, context-change comparison (no boxing), cache-key reuse and format parity, single-pair context push, and scanner output parity (#62#66).

Validation

  • dotnet build src/Reactor/Reactor.csproj -c Release → clean (0 warnings / 0 errors).
  • dotnet test tests/Reactor.Tests → full suite green (9681 passed, pre-existing skips only); the 11 new tests pass.
  • Selftests for the affected mount/update/unmount/keyed/theme/style/resource/context/accessibility paths → all green. (LT_OnMountUnmountBalanced is a pre-existing async-WaitFor flake — it fails more often on unmodified main than with this change.)

Perf impact validates via /perf once that workflow merges.

@azchohfi azchohfi marked this pull request as draft June 25, 2026 19:41
Comment thread src/Reactor/Core/AccessibilityScanner.cs Fixed
azchohfi added a commit that referenced this pull request Jun 25, 2026
Review-pass fixes for PR #666 (issue #654), all behavior-preserving and
within the assigned reconcile-path files:

- Context #25 (H1): CurrentValueEquals now reproduces the old object.Equals
  semantics EXACTLY — reference types use the virtual object.Equals (so an
  IEquatable<T> type without an object.Equals override keeps reference-equality
  change detection instead of collapsing distinct instances and skipping a
  required context rerender), value types keep the no-box EqualityComparer<T>
  path but guard the unbox with `is T` so a same-slot context-type swap returns
  false (as before) rather than throwing InvalidCastException.
- #26 ApplyResourceOverrides (CO2): removal is driven by `managed` vs new
  overrides, not an old-vs-new ref comparison; the shallow-skip fast path calls
  this with old===new yet still must reconcile away stale managed keys.
- #20 self-triggered set (CO3): add an Unmounted tombstone so a cached rerender
  closure firing after teardown can't resurrect the node into _selfTriggeredNodes
  (which the dirty-path pass would never clear again).
- #16/#17 unmount tag re-read (CO4): the V1/type unmount dispatch re-reads the
  attached tag AFTER the user .OnUnmount/RunCleanups callbacks (which may detach
  state via the public APIs), matching the pre-coalesce per-site reads; pre-callback
  branches still share the single top-of-method read.
- #23 ClearThemeBindings (AS1): clear the companion effective-theme marker DP
  alongside the style marker so the pair stays consistent across pool/reuse.

Tests (tests/Reactor.Tests/ReconcilerInteropPerfTests.cs):
- #15 rerender-delegate caching: identity reuse vs source-identity recreate;
  invocation marks SelfTriggered and bubbles to the source; post-unmount tombstone
  refuses a late self-trigger (#20).
- #25 (H1) reference-IEquatable keeps reference-equality; value-type type mismatch
  returns false without throwing.
- Scanner output parity pinned to exact literal messages (#62-#66).

GetOrCreateComponentRerender made internal (InternalsVisibleTo Reactor.Tests) as a
test seam. Validation: dotnet build src/Reactor -c Release 0/0; tests 9687 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azchohfi azchohfi requested a review from Copilot June 25, 2026 20:50

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets reconciler hot-path performance by reducing redundant WinRT/COM boundary crossings and avoiding per-element/per-frame allocations, while adding unit tests to pin behavior parity for these optimizations.

Changes:

  • Coalesces repeated COM/RCW reads/casts in update/unmount paths and introduces scratch-buffer reuse (lists, thread-static buffers) to cut allocations.
  • Adds self-triggered component tracking + cached rerender delegate to avoid per-frame closures and reduce dirty-path scanning work.
  • Refactors the debug accessibility scanner walk to reuse computed children/sibling snapshots and remove LINQ-heavy per-finding allocations; adds parity tests.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/Reactor.Tests/ReconcilerInteropPerfTests.cs New unit tests pinning behavior parity for cache-key building, context compare semantics, context push overload, scanner output, and rerender delegate caching.
src/Reactor/Core/Reconciler.Update.cs Coalesces FrameworkElement casts and reuses a realized-container scratch list to reduce per-refresh allocations.
src/Reactor/Core/Reconciler.Mount.cs Prefilters layout footgun inspection to grids and switches component mount to cached rerender delegate creation.
src/Reactor/Core/Reconciler.KeyedItemsBinding.cs Reuses ClosureItemViewSource when items/builder identities are unchanged to skip allocation + attached-DP write.
src/Reactor/Core/Reconciler.cs Adds multiple hot-path optimizations: thread-static cache-key builder, self-triggered tracking set, unmount scratch list reuse, modifier clearing via shared empty, resource override scan without ToHashSet/ToArray, stagger key reuse, and themed style guard.
src/Reactor/Core/ContextScope.cs Adds single-pair Push overload to avoid one-entry dictionary allocations.
src/Reactor/Core/Context.cs Introduces typed context current-vs-last comparison helper to avoid boxing in memo/context-change detection.
src/Reactor/Core/AccessibilityScanner.cs Refactors tree walk to compute children once, caches sibling-text arrays per frame, replaces LINQ with pooled/scratch buffers, and uses pooled sorting for tab-index gap checks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/Reactor/Core/Context.cs Outdated
Comment thread src/Reactor/Core/Reconciler.cs
Comment thread src/Reactor/Core/Reconciler.cs Outdated
Comment thread tests/Reactor.Tests/ReconcilerInteropPerfTests.cs Fixed
Comment thread tests/Reactor.Tests/ReconcilerInteropPerfTests.cs Fixed

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment thread src/Reactor/Core/Reconciler.cs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment thread tests/Reactor.Tests/ReconcilerInteropPerfTests.cs
azchohfi added a commit that referenced this pull request Jun 25, 2026
Addresses a Copilot review comment on #666: a cached rerender closure (perf #15)
handed to a long-lived or cross-thread subscriber can outlive unmount and keep its
ComponentNode alive. Drop ComponentNode.Control in ClearSelfTriggered(unmounting:true)
so a leaked closure can't also pin the unmounted visual subtree. The dirty-path pass
(perf #20) is the only reader of Control and never visits an unmounted node, so this is
observably a no-op — it only releases the subtree for GC, reinforcing the #20 tombstone.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azchohfi azchohfi requested a review from Copilot June 25, 2026 22:05

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread src/Reactor/Core/AccessibilityScanner.cs Outdated
Comment thread src/Reactor/Core/Reconciler.cs
Comment thread src/Reactor/Core/Reconciler.cs
Comment thread tests/Reactor.Tests/ReconcilerInteropPerfTests.cs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Comment thread src/Reactor/Core/AccessibilityScanner.cs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown

⚡ Reactor perf comparison

Workload: StressPerf.ReactorOptimized StocksGrid · --percent 50 --duration 10 · x64 Release · median of 12 paired runs (2 warmup dropped); Δ is the mean change with a 95% CI · PR head and main built and run interleaved on the same runner.

Regression vs main baseline

Metric main (baseline) This PR Δ (95% CI) Status
Renders/sec ↑ 2.55 2.52 +2.7% 95% CI [-4.2, +9.6] ≈ within noise
Avg Reconcile (ms) ↓ 135.9 135.2 -2.7% 95% CI [-7.1, +1.7] ≈ within noise
Avg Diff (ms) ↓ 122.0 121.7 -2.5% 95% CI [-7.2, +2.2] ≈ within noise
Avg Memory (MB) ↓ 286.1 285.8 0.0% 95% CI [-1.2, +1.2] ≈ within noise

Low-mutation skip-floor (--percent 0)

At --percent 0 the workload mutates few cells per tick (always at least one), so reconcile/diff isolate the O(n) per-tick child skip-walk floor that higher mutation rates dilute — ChildReconciler re-walks every child each tick even when nothing moved. The closer --percent is to 0, the more this floor is the signal, so a structural-skip optimization shows up cleanly where the headline table above buries it. Δ is the mean paired change with a 95% CI.

Metric main (baseline) This PR Δ (95% CI) Status
Renders/sec ↑ 16.45 16.30 -0.6% 95% CI [-7.0, +5.9] ≈ within noise
Avg Reconcile (ms) ↓ 35.4 36.2 +3.9% 95% CI [-3.7, +11.5] ≈ within noise
Avg Diff (ms) ↓ 33.4 34.1 +3.9% 95% CI [-4.3, +12.1] ≈ within noise
Avg Memory (MB) ↓ 268.4 267.9 0.0% 95% CI [-0.5, +0.5] ≈ within noise

Allocation (Reactor) — lower is better

Metric main (baseline) This PR Δ (95% CI) Status
Alloc bytes/render ↓ 5795645 5772253 -0.8% 95% CI [-2.4, +0.7] ≈ within noise
Gen0 GC / 1k renders ↓ 241.67 230.77 -4.5% 95% CI [-17.4, +8.4] ≈ within noise

Keyed-list workload (StressPerf.KeyedList, --percent 50)

A separate macro workload: a ~500-row stably keyed list whose rows are reordered / inserted / removed each tick. Because every child carries a key, the child reconciler takes its keyed arm (ReconcileKeyedReconcileKeyedMiddle, the LIS-based minimal-move pass) instead of the positional re-walk the StocksGrid tables above measure — so this is the sensitive macro signal for keyed-diff work the positional cells can never reach. Same interleaved paired-Δ 95% CI as the headline table.

Metric main (baseline) This PR Δ (95% CI) Status
Renders/sec ↑ 20.67 20.55 -0.1% 95% CI [-1.6, +1.4] ≈ within noise
Avg Reconcile (ms) ↓ 15.9 16.0 0.0% 95% CI [-1.6, +1.5] ≈ within noise
Avg Diff (ms) ↓ 15.8 15.7 -0.2% 95% CI [-1.7, +1.3] ≈ within noise
Avg Memory (MB) ↓ 168.5 169.0 +0.1% 95% CI [-0.4, +0.6] ≈ within noise

Allocation (keyed-list) — lower is better

Metric main (baseline) This PR Δ (95% CI) Status
Alloc bytes/render ↓ 313716 307439 -1.9% 95% CI [-2.3, -1.6] ✅ improvement
Gen0 GC / 1k renders ↓ 18.61 14.49 -10.0% 95% CI [-20.6, +0.7] ≈ within noise

Reconciler micro-benchmarks (PerfBench.ControlModel)

Production --variant Reactor control-model path, ns-resolution and WinUI-undiluted (spec-047 M1–M13) — ↓ lower is better. Status tracks allocated bytes/op, the authoritative signal here; it is deterministic for structurally-fixed benches, while dispatcher / background-thread benches carry a small process-to-process offset, so a bench is flagged only when its 95% CI clears a ±3% minimum-effect band (real structural alloc changes are several percent to many-x). ns/op is shown for context but is not auto-flagged (its paired CI is rep-interleaved but the flag remains dormant pending a real-CI identical-binary band calibration). Δ is the mean paired change with a 95% CI.

Bench main ns/op Δ ns (95% CI) main B/op Δ alloc (95% CI) Status
M1 Mount_Leaf_NoCallback 149377.3 +0.1% 95% CI [-1.1, +1.3] 1140.9 0.0% 95% CI [0.0, 0.0] ≈ within noise
M2 Mount_Leaf_OneCallback 110489.9 +0.6% 95% CI [-6.4, +7.6] 3383.3 0.0% 95% CI [0.0, 0.0] ≈ within noise
M3 Mount_Leaf_ThreeCallbacks 220396.5 -0.6% 95% CI [-5.6, +4.4] 8438.9 -0.6% 95% CI [-3.2, +2.0] ≈ within noise
M4 Dispatch_Switch_Cold 105685.8 +1.8% 95% CI [-2.8, +6.5] 1767.8 0.0% 95% CI [0.0, 0.0] ≈ within noise
M5 Dispatch_Switch_Warm 107225.2 +2.1% 95% CI [-3.6, +7.8] 1766.0 0.0% 95% CI [-1.7, +1.8] ≈ within noise
M6 Dispatch_ExternalType 91270.2 +0.2% 95% CI [-0.4, +0.8] 987.6 +0.1% 95% CI [-3.0, +3.2] ≈ within noise
M7 Update_NoChange 55354.1 +1.0% 95% CI [+0.1, +1.8] 452.1 +2.2% 95% CI [-4.6, +9.0] ≈ within noise
M8 Update_OneLeafChanged 41823.2 +1.4% 95% CI [+0.3, +2.5] 536.0 0.0% 95% CI [0.0, 0.0] ≈ within noise
M9 Update_AllChanged 2856759.2 +1.3% 95% CI [-0.7, +3.2] 184278.1 0.0% 95% CI [0.0, 0.0] ≈ within noise
M10 EventHandlerState_Alloc 85021.8 +2.0% 95% CI [-0.8, +4.8] 3095.2 0.0% 95% CI [-0.1, 0.0] ≈ within noise
M11 ModifierEHS_Frequency 46527.3 +3.4% 95% CI [-1.0, +7.8] 638.9 0.0% 95% CI [0.0, 0.0] ≈ within noise
M12 Pool_Rent_HotPath 117414.9 +1.6% 95% CI [-1.5, +4.8] 1099.9 0.0% 95% CI [0.0, 0.0] ≈ within noise
M13 Setters_Suppression_Scope 111.2 -4.5% 95% CI [-14.4, +5.4] 26.7 0.0% 95% CI [0.0, 0.0] ≈ within noise
M14 Dsl_Rebuild_Cascade 1565757.7 -15.8% 95% CI [-19.3, -12.4] 2231828.9 0.0% 95% CI [0.0, 0.0] ≈ within noise
C207 ChangeHandler_DpRead_Coalesce 1295.6 -0.3% 95% CI [-4.6, +4.1] 0.6 0.0% 95% CI [0.0, 0.0] ≈ within noise
OAlloc Optional_Element_Alloc 211.1 +9.4% 95% CI [-4.6, +23.3] 528.0 0.0% 95% CI [0.0, 0.0] ≈ within noise
OUpdate Optional_Reconciler_Update 12396.8 +0.3% 95% CI [-1.8, +2.4] 2772.3 0.0% 95% CI [0.0, 0.0] ≈ within noise

Cross-framework reference (same StocksGrid workload)

Metric vanilla WinUI3¹ Rust windows-reactor² Reactor (this PR)
Renders/sec ↑ 3.33 4.75 2.52
Avg Reconcile (ms) ↓ n/a 19.7 135.2
Avg Diff (ms) ↓ n/a 17.8 121.7
Avg Memory (MB) ↓ 263.9 196.4 285.8

↑ higher is better · ↓ lower is better. Within noise = the 95% confidence interval of the paired Δ includes 0 (no change resolvable at this sample size); ✅ improvement / ⚠️ regression require the CI to exclude 0.
Allocation metrics (alloc bytes/render, Gen0 GC) are the sensitive signal for allocation-reduction work, where the mean-ms / memory figures are largely flat. They read n/a for a harness built from a revision that predates them (rebase the PR onto main to populate them).
Reconciler micro-benchmarks run PerfBench.ControlModel --variant Reactor (M1–M13) as a headless loop bracketed by per-thread alloc + GC counters — ns-resolution and free of WinUI render / working-set dilution, so they resolve Core/Reconciler allocation deltas the macro StocksGrid workload cannot. main and PR each link their own src/Reactor build and are rep-interleaved (a fresh alternated process per rep); Δ is the paired 95% CI over per-rep means. The Status column tracks allocated bytes/op (deterministic for identical code); ns/op is informational — its paired CI is now unbiased but the flag stays dormant pending a real-CI identical-binary band calibration.
¹ vanilla WinUI3 = StressPerf.Direct (imperative; no virtual-DOM, so it has no reconcile/diff phase — those cells read n/a). Measured live on this runner.
² Rust = test_reactor_perf from microsoft/windows-rs — a port of this harness (same StocksGrid, same --percent/--duration CLI). Built from source and measured live on this runner.
Absolute numbers are runner-dependent — trust the Δ vs main, not the absolute values. Memory (working set) is the noisiest metric.
Runner: CPU: AMD EPYC 7763 64-Core Processor · 4 logical cores · 16 GB RAM · runner: GitHub Actions 1043006547.
Generated by .github/workflows/perf-compare.yml · PR 5cc6cbc vs main e8572a0 · 2026-06-27T07:29:42Z · run log.

azchohfi and others added 8 commits June 26, 2026 00:33
Reduces redundant WinRT/COM-RCW boundary crossings and per-element
allocations on the diff/reconcile/patch hot path (data-grid stress
workload, StocksGrid). Targets the dominant Reconcile gap.

Reconciler.cs / Update.cs / Mount.cs:
- #16 UnmountRecursive: read ReactorAttached.State once (was 5x GetValue).
- #17 UnmountAndCollect: read element tag once (was 5x GetElementTag).
- #14 Update: coalesce repeated `is FrameworkElement` RCW type-tests to one.
- #15 Cache the component rerender delegate on ComponentNode; recreate
  only when the requestRerender identity changes.
- #18 UpdateDefaultAutomationName: ref/ordinal short-circuit before the
  AutomationProperties.GetName COM read.
- #19 ApplyModifiers accepts null = clear-all (no fresh ElementModifiers).
- #20 PopulateDirtyAncestorPath iterates a tracked self-triggered set
  instead of the whole _componentNodes dict.
- #21 UnmountAndPool reuses a borrowed scratch list (reentrant calls fall
  back to a fresh list).
- #22 RefreshRealizedContainers reuses a cleared field list.
- #23 ApplyStyleToElement ref-equal + ActualTheme guard; ApplyThemeBindings
  gated on the ShallowEquals fast path.
- #24 BuildCacheKey: thread-static StringBuilder + pair buffer, single pass.
- #25 HasConsumedContextChanged compares via Context<T>/EqualityComparer<T>
  (no boxing of value-type context).
- #26 ApplyResourceOverrides: no ToHashSet/ToArray, ref-equal fast path.
- #27 WireReferenceListEdge: static loop, no per-edge List/closures.
- #28 ApplyStaggerDelays: hoisted static animation-key array.
- #29 PushContextDisposable: single-pair push overload, pooled PopOnDispose.

Reconciler.KeyedItemsBinding.cs:
- #39 ref-equality cache; only realloc + SetItemViewSource on change.

Reconciler.Mount.cs:
- #40 LayoutFootgunDetector gated by GridElement pre-filter.

AccessibilityScanner.cs (debug-only scan path):
- #62 compute GetChildren once in Walk, pass into Push.
- #63 cache sibling-text array in ScanContext, invalidate on Push.
- #64 BuildContext single-pass with pooled scratch (no LINQ chains).
- #65 IsLikelyEmoji explicit span loop (no LINQ).
- #66 CheckTabIndexGaps sorts a rented buffer (no OrderBy/ToList).

Behavior preserved exactly: coalesced COM reads observe the same values
at the same times; unmount/cleanup ordering unchanged; scanner emits
identical findings. AOT/trim clean (no new reflection).

Adds tests/Reactor.Tests/ReconcilerInteropPerfTests.cs (11 tests:
rerender-delegate caching, context-change compare, cache-key reuse,
single-pair context push, scanner output parity).

Closes #654

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review-pass fixes for PR #666 (issue #654), all behavior-preserving and
within the assigned reconcile-path files:

- Context #25 (H1): CurrentValueEquals now reproduces the old object.Equals
  semantics EXACTLY — reference types use the virtual object.Equals (so an
  IEquatable<T> type without an object.Equals override keeps reference-equality
  change detection instead of collapsing distinct instances and skipping a
  required context rerender), value types keep the no-box EqualityComparer<T>
  path but guard the unbox with `is T` so a same-slot context-type swap returns
  false (as before) rather than throwing InvalidCastException.
- #26 ApplyResourceOverrides (CO2): removal is driven by `managed` vs new
  overrides, not an old-vs-new ref comparison; the shallow-skip fast path calls
  this with old===new yet still must reconcile away stale managed keys.
- #20 self-triggered set (CO3): add an Unmounted tombstone so a cached rerender
  closure firing after teardown can't resurrect the node into _selfTriggeredNodes
  (which the dirty-path pass would never clear again).
- #16/#17 unmount tag re-read (CO4): the V1/type unmount dispatch re-reads the
  attached tag AFTER the user .OnUnmount/RunCleanups callbacks (which may detach
  state via the public APIs), matching the pre-coalesce per-site reads; pre-callback
  branches still share the single top-of-method read.
- #23 ClearThemeBindings (AS1): clear the companion effective-theme marker DP
  alongside the style marker so the pair stays consistent across pool/reuse.

Tests (tests/Reactor.Tests/ReconcilerInteropPerfTests.cs):
- #15 rerender-delegate caching: identity reuse vs source-identity recreate;
  invocation marks SelfTriggered and bubbles to the source; post-unmount tombstone
  refuses a late self-trigger (#20).
- #25 (H1) reference-IEquatable keeps reference-equality; value-type type mismatch
  returns false without throwing.
- Scanner output parity pinned to exact literal messages (#62-#66).

GetOrCreateComponentRerender made internal (InternalsVisibleTo Reactor.Tests) as a
test seam. Validation: dotnet build src/Reactor -c Release 0/0; tests 9687 passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Guards the boxed-Nullable<T> edge raised in Copilot review: a non-null int?
boxes to a boxed underlying int, and the value-type `is T` guard (T = int?)
still matches it, so an unchanged nullable context value is not spuriously
treated as changed. Proves the H1 hybrid CurrentValueEquals preserves the
old object.Equals nullable semantics against production code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds Assert.NotNull(wrapped) before invoking the cached rerender delegate in
the two CachedRerender_* tests. Asserts the GetOrCreateComponentRerender
contract (never returns null) and resolves a CodeQL "useless assignment"
false positive (its dataflow did not model the subsequent delegate
invocation as a read). Behavior-neutral test hardening.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Review-pass fixes from the pr-review skill (correctness + alternative-solution +
test-coverage dimensions) and the multi-model cross-check, all within the existing
reconcile interop scope. No behavior change:

- Context<T>.CurrentValueEquals (CO-H1, high): the value-type arm routed through
  EqualityComparer<T>.Default, which dispatches to IEquatable<T> and diverges from
  the original object.Equals for structs whose IEquatable<T>.Equals disagrees with
  their object.Equals override. Unify both arms to current.Equals(lastBoxed) — for
  two non-null operands that is exactly object.Equals — and keep the reference-type
  ReferenceEquals short-circuit so same-instance / non-reflexive cases stay exact.
  Still avoids boxing the freshly-read value for primitives/Nullable<T>. Drops the
  now-unused System.Collections.Generic using.
- BuildCacheKey (AS-L1): release the thread-static scratch pair array when an outlier
  element grows it past a small cap, so it is not pinned for the thread's lifetime.
- RefreshRealizedContainers (AS-L2): only retain the realized-scratch list when its
  capacity stays modest; an oversized one-off refresh is collected, not pinned.
- Tests: assert scanner emission ORDER (the parity test sorts first), plus value-type
  IEquatable-vs-object.Equals divergence and the reference-equality fast-path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses a Copilot review comment on #666: a cached rerender closure (perf #15)
handed to a long-lived or cross-thread subscriber can outlive unmount and keep its
ComponentNode alive. Drop ComponentNode.Control in ClearSelfTriggered(unmounting:true)
so a leaked closure can't also pin the unmounted visual subtree. The dirty-path pass
(perf #20) is the only reader of Control and never visits an unmounted node, so this is
observably a no-op — it only releases the subtree for GC, reinforcing the #20 tombstone.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eview)

Addresses three Copilot re-review comments on the interop-perf PR, all the
same scratch-hygiene pattern already applied to t_cacheKeyPairs (#24) and
_realizedScratch (#22): a one-off outlier input can leave a thread-static /
instance scratch buffer holding an oversized backing array (or stale element
refs) for the rest of the thread/reconciler lifetime. Behavior-neutral — the
computed outputs are unchanged, so existing parity tests cover them.

- AccessibilityScanner.BuildContext: wrap the t_buildCtxChildren populate/use
  in try/finally so a mid-population exception can't leave element refs pinned
  on the thread-static list until the next call. Success-path timing of the
  clear is unchanged.
- BuildCacheKey: mirror the pairs-array retain-cap for the thread-static
  StringBuilder (t_cacheKeySb) — release it after a one-off unusually large key
  instead of pinning the grown backing buffer.
- UnmountAndPool: mirror the _realizedScratch capacity cap for
  _unmountPoolScratch — only retain the scratch list when it's still modestly
  sized, so a one-off large-subtree unmount doesn't pin an oversized array.

Validation: dotnet build src/Reactor -c Release -> 0 warnings / 0 errors;
dotnet test tests/Reactor.Tests -> 9691 passed / 0 failed / 64 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…heck)

A multi-model (gpt-5.4) cross-check of the four reconcile/scan scratch retain-caps
flagged the BuildCacheKey site: at the degenerate combined-cap case (an element with
>64 theme bindings AND a >1024-char key rendered every frame), discarding the
thread-static buffers reallocates a KeyValuePair[] (2x the size of the pre-#24
baseline's string[]) plus a fresh StringBuilder — i.e. WORSE than origin/main, which
the perf PR must never be. The size-cap's only job was bounding memory retention, but
Array.Clear(pairs) / sb.Clear() already drop the referenced strings/ThemeRefs, so
nothing is pinned regardless. Removing the caps makes BuildCacheKey unconditional-
retain, which is strictly cheaper than baseline in every case (the dominant
StringBuilder backing buffer is reused, not realloc'd) and provably never worse.

The other two caps are unchanged — the cross-check CONFIRMED them: _realizedScratch
and _unmountPoolScratch discard to EXACTLY the origin/main allocation shape
(List<UIElement> / List<FrameworkElement>), so they cannot be worse than baseline, and
both sit well above the 50-row StocksGrid steady-state high-water (~64 <= 256) so they
never trim-and-regrow on the stress workload.

Behavior-neutral: BuildCacheKey return values are unchanged (only buffer-retention
differs), so existing parity tests cover it.

Validation: dotnet build src/Reactor -c Release -> 0 warnings / 0 errors;
dotnet test tests/Reactor.Tests -> 9691 passed / 0 failed / 64 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

@azchohfi azchohfi force-pushed the azchohfi-reconciler-interop-perf branch from 405f758 to b33b5cd Compare June 26, 2026 07:35
@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

@azchohfi

Copy link
Copy Markdown
Collaborator Author

Closing after re-validation on current main (post-#692 + #665). Confirmed an honest no-op: the COM-interop coalescing this targets is below measurement resolution on the available workloads, and — notably — the earlier skip-floor rps -6.6% regression DID NOT REPRODUCE (now -1.1% [-4.6,+2.3], within noise — it was a macro-rps false-positive). No win, no regression. Reopen if a COM-interop-heavy workload is added that can resolve the coalescing effect.

@azchohfi azchohfi closed this Jun 27, 2026
@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

1 similar comment
@azchohfi

Copy link
Copy Markdown
Collaborator Author

/perf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: coalesce COM-interop reads + allocations in reconcile path

2 participants