
perf(harness): add Flex/Yoga macro workload to /perf (closes #733) #737

Merged: azchohfi merged 3 commits into main from azchohfi-flex-perf-workload on Jun 27, 2026

@azchohfi azchohfi commented Jun 27, 2026

Collaborator

What & why

Issue #733 tracks that the /perf harness has no workload that stresses the FlexPanel / Yoga layout engine, so PR #670's layout-engine optimizations can't be measured — and #670 was closed as not-currently-measurable. This PR adds a deep, non-virtualized Flex/Yoga macro workload (StressPerf.Flex) so layout-engine allocation/measure deltas become observable.

Test-scaffolding only — no src/Reactor/** changes. This is framework-perf-neutral and safe to merge on its own. It must land on main first so the /perf baseline leg (built from origin/main) has a Flex exe; only then can #670 be revived and /perf'd to show the Flex deltas (the coordinator's follow-up).

Workload design (how it surfaces #670)

Structurally mirrors StressPerf.KeyedList; only the scene and the per-tick mutation differ, so Run-PerfBenchmark.ps1 and PerfLib.ps1 drive it identically.
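
The per-tick mutation described in the first commit message (a --percent fraction of leaves re-rolls grow/basis/width, the rest re-push identical inputs) can be modeled in a few lines. The sketch below is a hypothetical Python model, not the C# in FlexSceneSource.cs: `Leaf`, `set_inputs`, and `tick` are illustrative names, and the equality check only imitates the idea of the YogaNode setter-equality guard.

```python
import random
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical stand-in for one flex leaf (not the real YogaNode)."""
    grow: float = 1.0
    basis: float = 40.0
    width: float = 80.0
    dirty: bool = False

    def set_inputs(self, grow: float, basis: float, width: float) -> None:
        # Equality guard: identical inputs leave the node clean (cache hit).
        if (grow, basis, width) == (self.grow, self.basis, self.width):
            return
        self.grow, self.basis, self.width = grow, basis, width
        self.dirty = True  # changed inputs force a relayout of this leaf


def tick(leaves: list[Leaf], percent: int, rng: random.Random) -> int:
    """Run one tick and return how many leaves need a relayout."""
    for leaf in leaves:
        leaf.dirty = False
    mutate_count = len(leaves) * percent // 100
    mutated = set(rng.sample(range(len(leaves)), mutate_count))
    for i, leaf in enumerate(leaves):
        if i in mutated:
            # Re-roll by adding a non-zero offset, so the values always change.
            leaf.set_inputs(leaf.grow + rng.randint(1, 3),
                            leaf.basis + rng.randint(1, 20),
                            leaf.width + rng.randint(1, 20))
        else:
            # Re-push identical inputs: exercises the guard's early return.
            leaf.set_inputs(leaf.grow, leaf.basis, leaf.width)
    return sum(leaf.dirty for leaf in leaves)


leaves = [Leaf() for _ in range(2000)]
relaid = tick(leaves, percent=50, rng=random.Random(0))
```

In this model, `--percent 50` over 2000 leaves sends half the leaves down the relayout path and the other half through the guard's early return, which is the split the workload is meant to expose.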

Harness wiring

  • tests/stress_perf/StressPerf.Flex/: new csproj + Program.cs + FlexSceneSource.cs (clone of the KeyedList recipe; AOT-clean, RegisterAllBuiltIns() prelude).
  • Run-PerfBenchmark.ps1: Flex AppRegistry entry, -IncludeFlex (default $true), best-effort build, interleaved A/B leg (main-flex/pr-flex, drop-both pairing), aggregates, and -MainFlex/-PrFlex threaded into Format-PerfComment + result.json + ctx FlexSamples + the perf-counts log line.
  • PerfLib.ps1: Format-PerfFlexSection (4 headline metrics + shared alloc sub-table, Flex-appropriate heading/preamble), invoked after the keyed-list section.
  • Reactor.slnx: add the project.
  • Pester: cloned keyed-list cases in PerfLib.Tests.ps1 and RunPerfBenchmark.Tests.ps1 (static wiring contract).
  • perf-compare.yml: no change needed; the leg runs by default via the -IncludeFlex default.
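
One plausible reading of the "drop-both pairing" in the interleaved A/B leg is that each round yields a main-flex sample and a pr-flex sample, and a round is discarded entirely if either side failed, so both aggregates cover the same rounds. The Python sketch below illustrates that reading only; `pair_drop_both`, the use of `None` for a failed run, and the sample values are assumptions, not the logic or data in Run-PerfBenchmark.ps1.

```python
from typing import Optional


def pair_drop_both(
    main_samples: list[Optional[float]],
    pr_samples: list[Optional[float]],
) -> tuple[list[float], list[float]]:
    """Keep only rounds where both legs produced a sample."""
    kept_main, kept_pr = [], []
    for main_value, pr_value in zip(main_samples, pr_samples):
        if main_value is None or pr_value is None:
            continue  # one leg failed: drop both sides of the round
        kept_main.append(main_value)
        kept_pr.append(pr_value)
    return kept_main, kept_pr


# Illustrative values: round 2 failed on the PR leg, so it leaves both lists.
main_kept, pr_kept = pair_drop_both([120.0, 118.0, 121.0], [110.0, None, 112.0])
```

Dropping both sides keeps the comparison paired, so a failed run on one branch cannot skew the other branch's aggregate.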

Validation

  • dotnet build StressPerf.Flex.csproj -c Release: 0 warnings / 0 errors.
  • dotnet build Reactor.slnx -c Release: build succeeded (2 pre-existing VSTHRD103 warnings in an unrelated VS-extension project).
  • pwsh PerfLib.Tests.ps1: PASSED all 304 assertions.
  • pwsh RunPerfBenchmark.Tests.ps1: PASSED all 88 assertions.
  • Local headless smoke (2000 leaves, --percent 50 --duration 10) emitted sensible non-zero metrics, with the alloc sub-table inputs (allocBytesPerRender + gen0PerKRenders) present: avgReconcileMs ≈ 120 ms, avgDiffMs ≈ 95 ms, avgMemoryMB ≈ 355, allocBytesPerRender ≈ 8.14M, gen0 = 23, gen0PerKRenders ≈ 1438. (Building the instrument — the authoritative measurement is CI /perf.)
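
The smoke numbers are internally consistent if gen0PerKRenders means gen0 collections per 1,000 renders. The short Python check below assumes that definition and a render count of 16; the render count is inferred from the two reported values and is not stated in the PR.

```python
def gen0_per_k_renders(gen0_collections: int, renders: int) -> float:
    """Gen0 collections normalized to 1,000 renders (assumed definition)."""
    if renders <= 0:
        raise ValueError("renders must be positive")
    return gen0_collections / renders * 1000


# gen0 = 23 is from the smoke run; renders = 16 is an inferred assumption.
rate = gen0_per_k_renders(23, 16)  # 1437.5, in line with the reported ~1438
```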

Follow-up

Once this lands on main, #670 (perf: Yoga layout-cache guards + inline per-node arrays) can be update-branched and /perf'd so the new Flex leg reports its layout-engine alloc + memory deltas.

azchohfi and others added 3 commits June 27, 2026 09:30
Add StressPerf.Flex, a deep nested non-virtualized FlexPanel/Yoga macro workload to the /perf harness so PR #670's layout-engine alloc/memory optimizations become measurable. The scene re-lays-out a ~1500-leaf flex tree each 33ms tick and re-rolls grow/basis/width on a --percent fraction of leaves (real Yoga relayout), while the rest re-push identical inputs (the YogaNode setter-equality-guard cache-hit path). Structurally mirrors StressPerf.KeyedList so Run-PerfBenchmark.ps1/PerfLib.ps1 drive it identically; wires a -IncludeFlex A/B leg, Format-PerfFlexSection, slnx entry, and cloned Pester cases. Test-scaffolding only; no src/Reactor changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…check

Address coordinator review: bump the Flex scene to a single easy-to-bump leaf-target const (DefaultLeafTarget=2000, derived section count) so the inline-per-node-memory (#142/#143) win survives the noisy Avg-Memory-MB metric; strengthen the structural self-check to also assert the leaf survives the .Flex().Width() modifier chain as a concrete TextBlockElement (so a degraded leaf that drops its grow/basis/width inputs — a no-op mutation — fails loudly).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Multi-model review follow-ups (scoped, no workload-logic changes):
- Add a measurement caveat in two places (Format-PerfFlexSection rendered preamble + Program.cs header): the reconcile/diff ms rows do not capture the deferred Yoga Measure/Arrange pass (runs after OnRenderComplete); judge layout-engine wins on the flex allocation table + renders/sec; working-set memory is informational (too coarse for inline-array gains at this scale).
- RunPerfBenchmark.Tests.ps1: assert result.json carries mainFlex/prFlex.
- PerfLib.Tests.ps1: full-comment test asserting both Allocation (Reactor) and Allocation (flex) tables render.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azchohfi azchohfi marked this pull request as ready for review June 27, 2026 17:25
@azchohfi azchohfi merged commit afa40e1 into main Jun 27, 2026
20 checks passed