perf(harness): add Flex/Yoga macro workload to /perf (closes #733) #737
Merged
Add StressPerf.Flex, a deep nested non-virtualized FlexPanel/Yoga macro workload to the /perf harness so PR #670's layout-engine alloc/memory optimizations become measurable. The scene re-lays-out a ~1500-leaf flex tree each 33ms tick and re-rolls grow/basis/width on a --percent fraction of leaves (real Yoga relayout), while the rest re-push identical inputs (the YogaNode setter-equality-guard cache-hit path). Structurally mirrors StressPerf.KeyedList so Run-PerfBenchmark.ps1/PerfLib.ps1 drive it identically; wires a -IncludeFlex A/B leg, Format-PerfFlexSection, slnx entry, and cloned Pester cases. Test-scaffolding only; no src/Reactor changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…check Address coordinator review: bump the Flex scene to a single easy-to-bump leaf-target const (DefaultLeafTarget=2000, derived section count) so the inline-per-node-memory (#142/#143) win survives the noisy Avg-Memory-MB metric; strengthen the structural self-check to also assert the leaf survives the .Flex().Width() modifier chain as a concrete TextBlockElement (so a degraded leaf that drops its grow/basis/width inputs — a no-op mutation — fails loudly). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Multi-model review follow-ups (scoped, no workload-logic changes):
- Add a measurement caveat in two places (Format-PerfFlexSection rendered preamble + Program.cs header): the reconcile/diff ms rows do not capture the deferred Yoga Measure/Arrange pass (runs after OnRenderComplete); judge layout-engine wins on the flex allocation table + renders/sec; working-set memory is informational (too coarse for inline-array gains at this scale).
- RunPerfBenchmark.Tests.ps1: assert result.json carries mainFlex/prFlex.
- PerfLib.Tests.ps1: full-comment test asserting both Allocation (Reactor) and Allocation (flex) tables render.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
What & why
Issue #733 tracks that the /perf harness has no workload that stresses the FlexPanel / Yoga layout engine, so PR #670's layout-engine optimizations can't be measured, and #670 was closed as not-currently-measurable. This PR adds a deep, non-virtualized Flex/Yoga macro workload (StressPerf.Flex) so layout-engine allocation/measure deltas become observable.
Test-scaffolding only: no src/Reactor/** changes. This is framework-perf-neutral and safe to merge on its own. It must land on main first so the /perf baseline leg (built from origin/main) has a Flex exe; only then can #670 be revived and /perf'd to show the Flex deltas (the coordinator's follow-up).
Workload design (how it surfaces #670)
- root FlexColumn → N section FlexColumns → 10 FlexRows → 10 leaf cells, sized by a single bumpable const DefaultLeafTarget = 2000 (→ 20 sections → 2000 leaves) over three container levels, rebuilt every render (no memo fast-path), not in a virtualizing host. Bump the one const if a smoke run shows the alloc/memory delta too small to clear noise.
- DispatcherTimer re-rolls each leaf's grow/basis/width on a --percent fraction of leaves → a real Yoga measure/layout pass every frame (exercises #141/#144 list/line pooling, #142/#143 inline per-node arrays, #147 attached-DP push caching).
- (1 − percent) leaves re-push identical Yoga inputs each render: exactly the #138 YogaNode setter-equality-guard (no re-dirty → layout cache hits) path. The full tree is re-built and every leaf re-sets its props each tick (the workload never skips unchanged children; that split is the point). --percent 0 is the all-unchanged floor.
- Environment.FailFast asserts the representative containers are FlexElement-backed and the leaf survives the .Flex().Width() modifier chain as a concrete TextBlockElement, so a mis-built (non-Yoga) or input-dropping (no-op-mutation) scene fails loudly instead of reporting misleading numbers.
- … StressPerf.VirtualList is exactly why it can't surface #670).
Structurally mirrors StressPerf.KeyedList byte-for-byte (only the scene + per-tick mutation differ), so Run-PerfBenchmark.ps1/PerfLib.ps1 drive it identically.
Harness wiring
- tests/stress_perf/StressPerf.Flex/: new csproj + Program.cs + FlexSceneSource.cs (clone of the KeyedList recipe; AOT-clean, RegisterAllBuiltIns() prelude).
- Run-PerfBenchmark.ps1: FlexAppRegistry entry, -IncludeFlex (default $true), best-effort build, interleaved A/B leg (main-flex/pr-flex, drop-both pairing), aggregates, and -MainFlex/-PrFlex threaded into Format-PerfComment + result.json + ctx FlexSamples + the perf-counts log line.
- PerfLib.ps1: Format-PerfFlexSection (4 headline metrics + shared alloc sub-table, Flex-appropriate heading/preamble), invoked after the keyed-list section.
- Reactor.slnx: add the project.
- PerfLib.Tests.ps1 and RunPerfBenchmark.Tests.ps1 (static wiring contract).
- perf-compare.yml: no change needed; the leg runs by default via the -IncludeFlex default.
Validation
- dotnet build StressPerf.Flex.csproj -c Release → 0 warning / 0 error.
- dotnet build Reactor.slnx -c Release → build succeeded (2 pre-existing VSTHRD103 warnings in an unrelated VS-extension project).
- pwsh PerfLib.Tests.ps1 → PASSED all 304 assertions.
- pwsh RunPerfBenchmark.Tests.ps1 → PASSED all 88 assertions.
- … --percent 50 --duration 10) emitted sensible non-zero metrics, with the alloc sub-table inputs (allocBytesPerRender + gen0PerKRenders) present: avgReconcileMs ≈ 120 ms, avgDiffMs ≈ 95 ms, avgMemoryMB ≈ 355, allocBytesPerRender ≈ 8.14M, gen0 = 23, gen0PerKRenders ≈ 1438. (Building the instrument; the authoritative measurement is CI /perf.)
Follow-up
Once this lands on main, #670 (perf: Yoga layout-cache guards + inline per-node arrays) can be update-branched and /perf'd so the new Flex leg reports its layout-engine alloc + memory deltas.
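For readers who want the sizing and per-tick split from the Workload design section in concrete form, here is a minimal sketch. It is written in Python purely for illustration and is not the C# in FlexSceneSource.cs. All function names are hypothetical. It assumes the section count is the leaf target divided by 100 and rounded up, that --percent is a 0 to 100 value, and that the mutated leaves are picked by random sampling; the description does not confirm any of these details.

```python
import random

# Figures taken from the PR description: 10 FlexRows per section,
# 10 leaf cells per row, and a single leaf-target constant.
ROWS_PER_SECTION = 10
LEAVES_PER_ROW = 10
DEFAULT_LEAF_TARGET = 2000


def section_count(leaf_target=DEFAULT_LEAF_TARGET):
    """Derive the section count from the leaf target (ceiling division is an assumption)."""
    per_section = ROWS_PER_SECTION * LEAVES_PER_ROW
    return max(1, -(-leaf_target // per_section))


def roll_inputs(rng):
    """Hypothetical stand-in for one leaf's (grow, basis, width) inputs."""
    return (rng.uniform(0.0, 4.0), rng.uniform(20.0, 200.0), rng.uniform(20.0, 200.0))


def build_leaves(rng, leaf_target=DEFAULT_LEAF_TARGET):
    """Create the flat list of leaf inputs for the whole tree."""
    total = section_count(leaf_target) * ROWS_PER_SECTION * LEAVES_PER_ROW
    return [roll_inputs(rng) for _ in range(total)]


def tick(leaves, percent, rng):
    """Simulate one tick and return how many leaves were actually dirtied.

    Every leaf pushes its inputs again, but only the sampled fraction pushes
    new values. The equality check models a setter guard: identical inputs
    do not dirty the node, so the layout cache could be reused.
    """
    n_mutate = round(len(leaves) * percent / 100)
    chosen = set(rng.sample(range(len(leaves)), n_mutate))  # assumed selection strategy
    dirtied = 0
    for i, current in enumerate(leaves):
        pushed = roll_inputs(rng) if i in chosen else current
        if pushed != current:  # guard: skip re-dirty on identical inputs
            leaves[i] = pushed
            dirtied += 1
    return dirtied
```

With the default target, section_count() returns 20, which matches the 20 sections and 2000 leaves quoted above. Calling tick with percent 0 dirties nothing, which corresponds to the all-unchanged floor.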