Merged
7 changes: 5 additions & 2 deletions tests/stress_perf/METHODOLOGY.md
@@ -257,8 +257,11 @@ a real keyed diff every tick. The workload is deterministic (fixed RNG seed, con
row count — insertions paired with removals) so `main` and PR compare identical edit
sequences, and its rows' labels are content-stable so a moved row's text never changes
— isolating the **structural** (keyed-diff) signal from per-cell property updates. It
-reports the four headline metrics in its own table under the same interleaving, reps,
-warm-up, and 95%-CI gating as the headline leg, and is opt-out via
+reports the four headline metrics in its own table, plus an **allocation** sub-table
+(`Alloc bytes/render`, `Gen0 GC / 1k renders`) — the sensitive macro signal for
+keyed-diff *allocation* reductions the positional StocksGrid alloc table can't isolate,
+rendered only when the keyed leg reports the metric — all under the same interleaving,
+reps, warm-up, and 95%-CI gating as the headline leg, and is opt-out via
`-IncludeKeyedList $false`. See
[`ci/README.md`](ci/README.md#the-comment).

39 changes: 39 additions & 0 deletions tests/stress_perf/ci/PerfLib.Tests.ps1
@@ -507,6 +507,45 @@ $noKeyedComment = Format-PerfComment -Main $main -Pr $pr -WinUI3 $null -Rust $nu
Assert-True (-not ($noKeyedComment -like '*Keyed-list workload*')) 'keyed-list table omitted when keyed aggregates null'


# ── Keyed-list allocation sub-table: shared PerfAllocMetricSpec over keyed aggregates ──
# The keyed leg also renders an allocation sub-table (alloc bytes/render + Gen0 GC / 1k
# renders) — the macro signal for keyed-DIFF allocation reductions. Alloc moves DOWN
# main->PR (~20%, an improvement on a lower-is-better metric); tiny jitter keeps each
# paired CI off 0. Magnitudes mirror a real StressPerf.KeyedList @50% run (~328K
# bytes/render, ~63 Gen0/1k).
$keyedAllocMain = Measure-PerfRuns -Runs @(
[pscustomobject]@{ RendersPerSec = 18.5; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 328000; Gen0PerKRenders = 63.2; Gen0 = 6; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
[pscustomobject]@{ RendersPerSec = 18.6; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 328200; Gen0PerKRenders = 63.4; Gen0 = 6; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
[pscustomobject]@{ RendersPerSec = 18.4; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 327800; Gen0PerKRenders = 63.0; Gen0 = 6; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
)
$keyedAllocPr = Measure-PerfRuns -Runs @(
[pscustomobject]@{ RendersPerSec = 18.5; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 262000; Gen0PerKRenders = 50.2; Gen0 = 5; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
[pscustomobject]@{ RendersPerSec = 18.6; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 262200; Gen0PerKRenders = 50.4; Gen0 = 5; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
[pscustomobject]@{ RendersPerSec = 18.4; AvgReconcileMs = 6.98; AvgDiffMs = 6.86; AvgMemoryMB = 186; AllocBytesPerRender = 261800; Gen0PerKRenders = 50.0; Gen0 = 5; Gen1 = 2; Gen2 = 1; TotalRenders = 96; DurationSeconds = 5 }
)
$keyedAllocSection = Format-PerfKeyedListSection -MainKeyed $keyedAllocMain -PrKeyed $keyedAllocPr -Percent 50
$keyedAllocText = $keyedAllocSection -join "`n"
Assert-Match $keyedAllocText 'Allocation (keyed-list)' 'keyed section renders the allocation sub-table when alloc present'
Assert-Match $keyedAllocText 'Alloc bytes/render' 'keyed alloc sub-table has bytes/render row'
Assert-Match $keyedAllocText 'Gen0 GC / 1k renders' 'keyed alloc sub-table has Gen0 row'
$keyedAllocRow = ($keyedAllocSection | Where-Object { $_ -match 'Alloc bytes/render' }) -join ' '
Assert-Match $keyedAllocRow 'improvement' 'keyed alloc DOWN main->PR reads improvement (lower-is-better honored)'
# The allocation sub-table sits AFTER the keyed headline metrics table within the section.
$idxKeyedHead = $keyedAllocText.IndexOf('Avg Reconcile')
$idxKeyedAlloc = $keyedAllocText.IndexOf('Allocation (keyed-list)')
Assert-True (($idxKeyedHead -ge 0) -and ($idxKeyedHead -lt $idxKeyedAlloc)) 'keyed alloc sub-table follows the keyed headline metrics table'

# Omitted when the keyed aggregates carry no alloc metrics (legacy keyed head). The
# $keyedMain/$keyedPr aggregates above were built without alloc fields.
Assert-True (-not ($keyedSectionText -like '*Allocation (keyed-list)*')) 'keyed alloc sub-table omitted when keyed aggregates lack alloc'

# In a full comment the positional StocksGrid allocation table and the keyed allocation
# sub-table are DISTINCT, separately-labelled tables (positional vs keyed workload allocs).
$bothAllocComment = Format-PerfComment -Main $allocMain -Pr $allocPr -WinUI3 $null -Rust $null -MainKeyed $keyedAllocMain -PrKeyed $keyedAllocPr -Context $ctx
Assert-Match $bothAllocComment 'Allocation (Reactor)' 'full comment keeps the StocksGrid allocation table'
Assert-Match $bothAllocComment 'Allocation (keyed-list)' 'full comment adds the distinct keyed allocation sub-table'
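
The "~20%" figure in the fixture comment can be checked by hand. The sketch below recomputes the relative change from the first-run fixture values above. `Get-RelativeChangePct` is a hypothetical helper introduced only for this illustration; the real table derives its delta and status from `Get-PerfDelta` with a paired 95% CI, which this plain percentage does not reproduce.

```powershell
# Hypothetical helper (not part of PerfLib.ps1): plain relative change in percent.
function Get-RelativeChangePct {
    param([double]$Baseline, [double]$Candidate)
    if ($Baseline -eq 0) { return $null }   # avoid dividing by zero
    [math]::Round(100 * ($Candidate - $Baseline) / $Baseline, 1)
}

# Fixture values from the first main / PR run above.
$allocChange = Get-RelativeChangePct -Baseline 328000 -Candidate 262000   # -20.1
$gen0Change  = Get-RelativeChangePct -Baseline 63.2   -Candidate 50.2     # -20.6

# Both metrics are lower-is-better, so a negative change is an improvement.
"Alloc bytes/render: $allocChange %"
"Gen0 GC / 1k renders: $gen0Change %"
```

Both values are negative, which is why the test expects the `Alloc bytes/render` row to read `improvement`.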


# ── Reconciler micro-suite: Read-MicroBenchResults / comparison / render ──────
function New-MicroRow {
param([string]$BenchId, [string]$Name, [string]$Variant, [int]$Rep, [double]$MeanNs, [double]$AllocBytes, [string]$Status = 'ok', [int]$Iterations = 1)
49 changes: 43 additions & 6 deletions tests/stress_perf/ci/PerfLib.ps1
@@ -745,9 +745,10 @@ function Format-PerfSkipFloorSection {
function Format-PerfKeyedListSection {
<#
.SYNOPSIS
-Render the keyed-list workload table: the four headline metrics measured on
-StressPerf.KeyedList — a ~500-row stably keyed list whose rows are reordered /
-inserted / removed each tick. Empty array when there is nothing to show.
+Render the keyed-list workload section: the four headline metrics plus an
+allocation sub-table measured on StressPerf.KeyedList — a ~500-row stably keyed
+list whose rows are reordered / inserted / removed each tick. Empty array when
+there is nothing to show.
.DESCRIPTION
Unlike the positional StocksGrid headline/skip-floor legs (whose cells mutate
in place by index, always taking ChildReconciler.ReconcilePositional), this is
@@ -756,9 +757,13 @@ function Format-PerfKeyedListSection {
the sensitive macro measure for keyed-diff optimizations (keyed-list diff,
keyed structural-skip) that the StocksGrid workload can never exercise. Reuses
the same paired-Δ 95% CI machinery (Get-PerfDelta over the index-aligned
-per-run samples) as the headline table. Returns an empty array when either
-aggregate is $null (keyed-list leg disabled, build omitted, or one side
-produced no metrics), so the caller renders nothing.
+per-run samples) as the headline table. Also appends an **allocation** sub-table
+— the shared PerfAllocMetricSpec (alloc bytes/render + Gen0 GC / 1k renders) over
+the keyed aggregates — the sensitive macro signal for keyed-DIFF allocation
+reductions that the positional StocksGrid allocation table can never isolate;
+rendered only when the keyed leg reported allocation metrics. Returns an empty
+array when either aggregate is $null (keyed-list leg disabled, build omitted, or
+one side produced no metrics), so the caller renders nothing.
.PARAMETER MainKeyed Aggregated baseline keyed-list metrics (Measure-PerfRuns), or $null.
.PARAMETER PrKeyed Aggregated PR-head keyed-list metrics, or $null.
.PARAMETER Percent The mutation percent the keyed-list leg ran at (heading / preamble).
@@ -791,6 +796,38 @@ function Format-PerfKeyedListSection {
(Get-PerfStatusGlyph $delta.Status)))
}
$lines.Add('')

# Allocation sub-table for the keyed workload: the shared PerfAllocMetricSpec
# (Alloc bytes/render, Gen0 GC / 1k renders) rendered over the keyed aggregates with
# the identical paired-Δ 95% CI machinery used above. This is the sensitive MACRO
# signal for keyed-DIFF allocation reductions — allocBytesPerRender tracks reorder
# volume on the keyed (ReconcileKeyed → ReconcileKeyedMiddle / LIS) path, an alloc
# signal the positional StocksGrid allocation table can never isolate. Rendered only
# when the keyed leg reported allocation metrics (every StressPerf.KeyedList build
# does; n/a only for a legacy head opened before the metric landed).
$hasKeyedAlloc = ($null -ne $MainKeyed.AllocBytesPerRender) -or ($null -ne $PrKeyed.AllocBytesPerRender) -or
($null -ne $MainKeyed.Gen0PerKRenders) -or ($null -ne $PrKeyed.Gen0PerKRenders)
if ($hasKeyedAlloc) {
$lines.Add('**Allocation (keyed-list)** &mdash; lower is better')
$lines.Add('')
$lines.Add('| Metric | `main` (baseline) | This PR | Δ (95% CI) | Status |')
$lines.Add('|---|--:|--:|--:|:--|')
foreach ($m in $script:PerfAllocMetricSpec) {
$bVal = $MainKeyed.($m.Key)
$pVal = $PrKeyed.($m.Key)
$spread = [math]::Max([double]$MainKeyed."$($m.Key)Spread", [double]$PrKeyed."$($m.Key)Spread")
$delta = Get-PerfDelta -Baseline $bVal -Candidate $pVal -LowerIsBetter $m.LowerIsBetter -SpreadPct $spread `
-BaselineSamples $MainKeyed."$($m.Key)Samples" -CandidateSamples $PrKeyed."$($m.Key)Samples"
$lines.Add(('| {0} {1} | {2} | {3} | {4} | {5} |' -f `
$m.Label, $m.Arrow, `
(Format-PerfNumber $bVal $m.Digits), `
(Format-PerfNumber $pVal $m.Digits), `
(Format-PerfDeltaCell $delta), `
(Get-PerfStatusGlyph $delta.Status)))
}
$lines.Add('')
}

return $lines.ToArray()
}
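
The new block relies on two PowerShell behaviours: reading a missing property yields `$null` (which casts to `0` as a `[double]`), and a member name can be built at runtime from a string. The sketch below isolates both, assuming strict mode is off; under `Set-StrictMode -Version 2.0` or later, reading a non-existent property raises an error instead. The objects and the `Test-HasKeyedAlloc` function are hypothetical stand-ins, not the real `Measure-PerfRuns` output or anything defined in `PerfLib.ps1`.

```powershell
# Hypothetical stand-ins for the keyed aggregates; property names follow the diff,
# values are invented for illustration.
$withAlloc = [pscustomobject]@{
    AllocBytesPerRender       = 328000
    AllocBytesPerRenderSpread = 0.12
    Gen0PerKRenders           = 63.2
}
$legacy = [pscustomobject]@{ AvgReconcileMs = 6.98 }   # carries no alloc fields

# Hypothetical wrapper with the same shape as the $hasKeyedAlloc guard:
# true when either side carries either allocation metric.
function Test-HasKeyedAlloc {
    param($MainKeyed, $PrKeyed)
    ($null -ne $MainKeyed.AllocBytesPerRender) -or ($null -ne $PrKeyed.AllocBytesPerRender) -or
    ($null -ne $MainKeyed.Gen0PerKRenders) -or ($null -ne $PrKeyed.Gen0PerKRenders)
}

$key           = 'AllocBytesPerRender'
$value         = $withAlloc.$key                    # dynamic member access by name
$spread        = [double]$withAlloc."${key}Spread"  # name composed at runtime
$missingSpread = [double]$legacy."${key}Spread"     # missing property is $null, cast gives 0

Test-HasKeyedAlloc $withAlloc $legacy   # True: one side is enough
Test-HasKeyedAlloc $legacy $legacy      # False: sub-table would be skipped
```

Because the guard uses `-or`, the sub-table is emitted when only one side reports allocation metrics.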

6 changes: 5 additions & 1 deletion tests/stress_perf/ci/README.md
@@ -232,7 +232,11 @@ Several tables plus footnotes:
`ReconcileKeyedMiddle`, the LIS-based minimal-move pass) instead of the positional
re-walk the StocksGrid tables measure — so this is the sensitive macro signal for
**keyed-diff** optimizations (keyed-list diff, keyed structural-skip) that the
-positional cells can never exercise. Same paired-CI gating as Table 1; omitted
+positional cells can never exercise. It also carries its own **allocation**
+sub-table (`Alloc bytes/render` + `Gen0 GC / 1k renders` over the keyed aggregates,
+same spec as the StocksGrid allocation table) — the sensitive macro signal for
+keyed-diff *allocation* reductions the positional alloc table can't isolate, rendered
+only when the keyed leg reports the metric. Same paired-CI gating as Table 1; omitted
when `-IncludeKeyedList $false`, the workload build fails, or a side produces no
metrics.
- **Reconciler micro-benchmarks** — per-bench `ns/op` and `B/op` from the