Optimize validity checks by resolving to masks upfront by joseph-isaacs · Pull Request #8306 · vortex-data/vortex

joseph-isaacs · 2026-06-09T10:57:48Z

Summary

This PR optimizes performance by resolving Validity objects to Mask objects once upfront, rather than calling Validity::is_valid() repeatedly in loops. This is particularly important for array-backed validity, where each is_valid() call executes a scalar operation and spins up an execution context.

The changes follow a consistent pattern across multiple files:

Call validity.execute_mask() once to resolve to a concrete Mask
Use mask.value(idx) for O(1) lookups instead of O(n) scalar executions per call
Handle the three cases of Mask::bit_buffer(): AllOr::All, AllOr::None, and AllOr::Some

Key Changes

varbin filter: Changed from Validity to Mask parameter, with optimized handling of all/none/some cases
display module: Resolve validity once for table display instead of per-row checks
listview rebuild: Resolve validity once for both rebuild_with_take and rebuild_with_builder
vortex-tui layouts: Canonicalize struct fields once to avoid re-decoding per row
vortex-tensor l2_denorm: Resolve validity masks once for normalization loops
list/listview/fixed_size_list operations: Canonicalize element arrays once before per-element scalar execution
variant kernel: Canonicalize inputs once before per-row merging

All changes maintain the same logical behavior while improving performance by reducing redundant work in hot loops.

Testing

Existing tests pass. The changes are refactorings that preserve behavior while optimizing performance characteristics. The pattern of resolving validity to masks upfront is already used elsewhere in the codebase and is well-tested through existing array operation tests.

https://claude.ai/code/session_016i3hghJiXxmtWWcVu55Pjn

`rebuild_with_take` and `rebuild_list_by_list` probed validity per row via `self.validity()?.is_valid(index)?`. For array-backed validity this executes a scalar (and creates an execution context) on every row, turning an O(n) loop into O(n) per-row work; even for non-nullable validity it reconstructed the `Validity` enum on every iteration. Resolve validity to a `Mask` once before the loop with `execute_mask`, then use the O(1) `Mask::value(index)` lookup. Add nullable ListView rebuild benchmarks (the existing ones only covered non-nullable validity). Measured (divan, this machine): - listview_rebuild i32_small_nullable: 231.7 us -> 33.55 us (~6.9x) - listview_rebuild i32_large_nullable: 1.244 ms -> 1.014 ms (~1.23x) - zstd listview_rebuild rebuild_naive (non-nullable, 1024 lists): 41.1 us -> 30.1 us (~1.37x) Signed-off-by: Claude <noreply@anthropic.com>

Two more sites that probed validity per row via `Validity::is_valid` (which executes a scalar per call for array-backed validity): - VarBin filter index path (`filter_select_var_bin_by_index`): now resolves the validity mask once with `execute_mask` and matches on `AllOr` fast paths, the same way the slice path already does. Removes the lingering `TODO(ngates): pass LogicalValidity instead`. - L2Denorm normalize and verify loops: resolve the validity mask once before the per-row loop instead of calling `is_valid(i)` each iteration. Signed-off-by: Claude <noreply@anthropic.com>

…ild benches Replace per-element/per-row `execute_scalar` decode loops with a single canonicalization of the source before iterating: - List/ListView/FixedSizeList `scalar_at`: canonicalize the element slice once so building the list scalar does not re-decode the element encoding per element. - Variant `merge_typed_as_variant`: canonicalize the `typed`/`fallback` inputs once before the row-wise fallback loop. Also remove the nullable ListView rebuild benchmarks that were added earlier. Signed-off-by: Claude <noreply@anthropic.com>

- Table display: reuse a single execution context across the struct table loops and resolve validity to a mask once, instead of creating a context and probing validity per row. - TUI browse stats table: canonicalize each field array once before the per-row `execute_scalar` loop so field encodings are not re-decoded per row. Signed-off-by: Claude <noreply@anthropic.com>

codspeed-hq · 2026-06-09T11:06:58Z

Merging this PR will improve performance by 20.44%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 6 improved benchmarks
❌ 3 regressed benchmarks
✅ 1504 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`varbinview_zip_block_mask`	2.9 ms	3.7 ms	-21.57%
❌	Simulation	`encode_varbin[(1000, 2)]`	143.1 µs	164.1 µs	-12.77%
❌	Simulation	`varbinview_zip_fragmented_mask`	6.1 ms	6.9 ms	-11.3%
⚡	Simulation	`extend_from_array_non_zctl_overlapping[(10000, 8)]`	4.7 ms	2.3 ms	×2
⚡	Simulation	`extend_from_array_non_zctl_overlapping[(1000, 8)]`	527.9 µs	287.6 µs	+83.56%
⚡	Simulation	`rebuild_naive`	142.1 µs	104.9 µs	+35.46%
⚡	Simulation	`extend_from_array_non_zctl_overlapping[(1000, 32)]`	1,023.8 µs	782.9 µs	+30.76%
⚡	Simulation	`extend_from_array_zctl[(10000, 8)]`	2.5 ms	2.1 ms	+17.04%
⚡	Simulation	`extend_from_array_zctl[(1000, 8)]`	291 µs	254.7 µs	+14.25%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing claude/audit-per-row-accessors-ln5lho (2c2f634) with develop (1b19ac9)}

claude added 4 commits June 8, 2026 21:48

joseph-isaacs closed this Jun 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize validity checks by resolving to masks upfront#8306

Optimize validity checks by resolving to masks upfront#8306
joseph-isaacs wants to merge 4 commits into
developfrom
claude/audit-per-row-accessors-ln5lho

joseph-isaacs commented Jun 9, 2026

Uh oh!

codspeed-hq Bot commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

joseph-isaacs commented Jun 9, 2026

Summary

Key Changes

Testing

Uh oh!

codspeed-hq Bot commented Jun 9, 2026

Merging this PR will improve performance by 20.44%

Performance Changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants