Summary
The "layout-cache" setter equality guards added to YogaNode (the #138 piece of PR #670) are a severe performance regressor on dynamic flex trees: they roughly double the Yoga layout pass and cut Renders/sec by 39.6% on the Flex/Yoga macro workload. They were reverted in #740, which keeps #670's other, genuinely clean optimizations. This issue records the finding so the guards aren't reintroduced, and notes the one conditional way the idea could be salvaged.
What #138 did
Each YogaNode style setter was guarded with an equality check (if (current == value) return;) so that re-applying an unchanged style value each frame would NOT re-dirty the node — intended to let the frame-level Yoga layout cache hit for stable cells.
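A minimal sketch of the guard pattern described above, written as a toy Python model rather than the project's actual code. The `Node` class, its `width` property, the `is_dirty` flag, and `reapply_same_value` are all hypothetical names chosen for illustration; the real YogaNode setters may differ in shape.

```python
class Node:
    """Toy stand-in for a layout node; not the real YogaNode API."""

    def __init__(self, guarded: bool):
        self.guarded = guarded
        self._width = 0.0
        self.is_dirty = False

    @property
    def width(self) -> float:
        return self._width

    @width.setter
    def width(self, value: float) -> None:
        # The #138-style guard: an unchanged value returns early,
        # so the node is not marked dirty again.
        if self.guarded and self._width == value:
            return
        self._width = value
        self.is_dirty = True


def reapply_same_value(guarded: bool) -> bool:
    """Return whether re-applying an unchanged style re-dirties the node."""
    node = Node(guarded)
    node.width = 100.0     # first application always dirties
    node.is_dirty = False  # pretend a layout pass cleared the flag
    node.width = 100.0     # next frame re-applies the same style
    return node.is_dirty
```

With the guard, the second assignment leaves the node clean; without it, the node is dirtied on every re-application, which is the behavior on main.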
Why it regresses dynamic trees
FlexPanel re-applies every child's attached style on each MeasureOverride. With the guards suppressing the per-frame re-dirty, the non-dirty nodes fall into Yoga's CanUseCachedMeasurement validation path (which scans up to 8 cached measurements per node). On a churning (dynamic) tree the available-size keys shift every frame, so the cache misses every frame anyway. The guarded path therefore runs the validation scan and still measures, which is strictly more work than main's simpler dirty-clear-then-measure-once. The guards only pay off on static trees, where the cache genuinely hits.
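The cost argument above can be illustrated with a small counting model. This is a deliberately simplified sketch, not Yoga's algorithm: it assumes the cache is keyed on a single available-size number and that validation is a plain equality scan, whereas the real CanUseCachedMeasurement check is more involved. `run_frames`, `Counts`, and `MAX_CACHED` are hypothetical names.

```python
from dataclasses import dataclass

MAX_CACHED = 8  # the issue states up to 8 cached measurements are scanned per node


@dataclass
class Counts:
    scans: int = 0     # cached entries inspected during validation
    measures: int = 0  # full measure calls performed


def run_frames(available_sizes, guarded: bool) -> Counts:
    """Simulate one node over several frames and count the work done."""
    cache = []  # available-size keys of recent measurements, newest last
    counts = Counts()
    for size in available_sizes:
        hit = False
        if guarded:
            # Node stays non-dirty, so the cached entries are validated first.
            for key in cache:
                counts.scans += 1
                if key == size:
                    hit = True
                    break
        else:
            # Node is re-dirtied every frame: the cache is cleared, no scan.
            cache.clear()
        if not hit:
            counts.measures += 1
            cache.append(size)
            del cache[:-MAX_CACHED]  # keep only the newest MAX_CACHED keys
    return counts


dynamic = list(range(100, 200))  # available size changes every frame
static = [100] * 100             # available size never changes

print(run_frames(dynamic, guarded=True))   # same measures as unguarded, plus scans
print(run_frames(dynamic, guarded=False))  # one measure per frame, no scans
print(run_frames(static, guarded=True))    # a single measure, then cache hits
```

In this model the guarded dynamic run performs exactly as many measures as the unguarded one and adds the scan work on top, while the guarded static run measures only once.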
Evidence
The StressPerf.Flex macro workload (added in #737, "perf(harness): add Flex/Yoga macro workload to /perf", which closes #733) measured #670 ("perf: Yoga layout-cache guards + inline per-node arrays") at Flex Renders/sec −39.6% [−43.1, −36.1].
Resolution
#740 reverts the #138 unit entirely (the YogaNode setter guards, the FlexPanel route-through-guarded-setters change, and the YogaAlgorithm per-generation flex-basis compensation hunk), restoring main's unconditional-dirty behavior, while keeping #670's clean wins (inline per-node arrays, list/line pooling, attached-DP cache). Net result: −6.4% Flex allocation, throughput neutral.
Possible future work (de-prioritized)
#138's cache-hit benefit is real for static trees. It could theoretically be salvaged by gating the guards to a static-tree heuristic (only skip re-dirty when the subtree isn't churning), but that requires a reliable static-vs-dynamic signal and reintroduces the −39.6% risk if mis-detected: high risk for a benefit that only applies to non-animating UI. Not pursued now. The StressPerf.Flex /perf leg can validate any future attempt.
Lesson: "skip work if the input is unchanged" guards can backfire when a downstream cache's validation cost exceeds the work they skip. Measure on a dynamic workload before assuming a cache guard helps.
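The lesson can be phrased as a rough break-even check. The sketch below assumes a simple per-node cost model that is not taken from the source: the guarded path always pays a validation cost and pays the measure cost only on a cache miss, while the unguarded path always pays the measure cost. The function name and the cost units are hypothetical.

```python
def guard_saves_work(hit_rate: float, validate_cost: float, measure_cost: float) -> bool:
    """Return True when the guarded path is expected to be cheaper.

    Assumed model (illustrative only):
      unguarded cost = measure_cost
      guarded cost   = validate_cost + (1 - hit_rate) * measure_cost
    """
    unguarded = measure_cost
    guarded = validate_cost + (1.0 - hit_rate) * measure_cost
    return guarded < unguarded


# Dynamic tree: the cache never hits, so validation is pure overhead.
print(guard_saves_work(hit_rate=0.0, validate_cost=1.0, measure_cost=10.0))  # False
# Static tree: the cache always hits, so the measure is skipped entirely.
print(guard_saves_work(hit_rate=1.0, validate_cost=1.0, measure_cost=10.0))  # True
```

Under this model the guard only wins when `hit_rate * measure_cost` exceeds `validate_cost`, which is why a workload with a near-zero hit rate is the case worth measuring first.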