fix[gpu]: decode extension arrays on the GPU #8353

Merged
0ax1 merged 1 commit into develop from ad/gpu-decode-ext-storage
Jun 11, 2026

0ax1 commented Jun 11, 2026

Contributor

Fixes #8143

0ax1 requested review from a team, joseph-isaacs and onursatici June 11, 2026 09:58
0ax1 changed the title "feat[gpu]: decode compressed extension-typed columns on GPU" to "fix[gpu]: decode compressed extension-typed columns on GPU" Jun 11, 2026
0ax1 added the changelog/fix (A bug fix) label Jun 11, 2026
Extension arrays match AnyCanonical regardless of how their storage is
encoded, so execute_cuda returned them via the canonical early-return
and compressed storage (e.g. ext(date) -> fastlanes.for ->
fastlanes.bitpacked) was never decoded by a GPU kernel. Recurse into
the storage array before the canonical check, mirroring the existing
Struct field recursion.

Fixes #8143

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
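
For illustration, the control flow described in the commit message can be sketched with a toy model. This is not Vortex's actual code: the `Array` enum, `is_canonical`, `run_kernel`, and `execute` are hypothetical stand-ins for the real array types, the `AnyCanonical` match, a GPU decode kernel, and `execute_cuda`, and everything runs on the CPU. It only shows why recursing into extension storage has to happen before the canonical early return.

```rust
// Toy model of the control flow only; none of these names are Vortex's real API.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    Primitive(Vec<i64>),
    // Frame-of-reference encoding: each decoded value is `reference + delta`.
    For { reference: i64, deltas: Box<Array> },
    Struct(Vec<Array>),
    Extension { name: &'static str, storage: Box<Array> },
}

impl Array {
    // Assumption for this sketch: extension and struct arrays count as
    // canonical no matter how their children are encoded.
    fn is_canonical(&self) -> bool {
        !matches!(self, Array::For { .. })
    }
}

// Hypothetical stand-in for launching a decode kernel on an encoded array.
fn run_kernel(array: Array) -> Array {
    match array {
        Array::For { reference, deltas } => match execute(*deltas) {
            Array::Primitive(values) => {
                Array::Primitive(values.into_iter().map(|v| v + reference).collect())
            }
            other => panic!("unexpected deltas in this sketch: {other:?}"),
        },
        canonical => canonical,
    }
}

// Hypothetical stand-in for `execute_cuda`.
fn execute(array: Array) -> Array {
    // Recurse into container children *before* the canonical check, so that
    // encoded struct fields and encoded extension storage still get decoded.
    let array = match array {
        Array::Struct(fields) => Array::Struct(fields.into_iter().map(execute).collect()),
        Array::Extension { name, storage } => Array::Extension {
            name,
            storage: Box::new(execute(*storage)),
        },
        other => other,
    };

    // Canonical early return: without the recursion above, an extension array
    // would leave here with its storage still encoded.
    if array.is_canonical() {
        return array;
    }
    run_kernel(array)
}
```

In this model, an `Extension` wrapping a `For` array reports itself as canonical, yet `execute` still returns it with `Primitive` storage because the storage is visited first.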
0ax1 force-pushed the ad/gpu-decode-ext-storage branch from b6c0c61 to 2d6871c June 11, 2026 10:04
codspeed-hq Bot commented Jun 11, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 6 improved benchmarks
❌ 6 regressed benchmarks
✅ 1520 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 20.2 µs | 35.2 µs | -42.51% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 177.2 µs | 213.3 µs | -16.93% |
| Simulation | varbinview_large | 113.1 µs | 131.7 µs | -14.14% |
| Simulation | decompress_rd[f64, (100000, 0.0)] | 845.2 µs | 980 µs | -13.76% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 215.3 ns | 244.4 ns | -11.93% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 274.4 µs | 309.3 µs | -11.27% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 138 µs | 110.9 µs | +24.41% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 137.8 µs | 110.9 µs | +24.24% |
| Simulation | decompress_rd[f64, (10000, 0.01)] | 137.4 µs | 110.6 µs | +24.23% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 89.3 µs | 80.2 µs | +11.4% |
| Simulation | decompress_rd[f32, (10000, 0.0)] | 89.6 µs | 80.8 µs | +10.93% |
| Simulation | decompress_rd[f32, (10000, 0.01)] | 89.3 µs | 80.7 µs | +10.66% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ad/gpu-decode-ext-storage (2d6871c) with develop (3d7bbfb)

0ax1 changed the title "fix[gpu]: decode compressed extension-typed columns on GPU" to "fix[gpu]: decode extension-typed columns on GPU" Jun 11, 2026
0ax1 enabled auto-merge (squash) June 11, 2026 10:14
0ax1 changed the title "fix[gpu]: decode extension-typed columns on GPU" to "fix[gpu]: decode extension arrays on GPU" Jun 11, 2026
0ax1 changed the title "fix[gpu]: decode extension arrays on GPU" to "fix[gpu]: decode extension arrays on the GPU" Jun 11, 2026
0ax1 merged commit a289c23 into develop Jun 11, 2026
63 of 64 checks passed
0ax1 deleted the ad/gpu-decode-ext-storage branch June 11, 2026 10:31

Labels

changelog/fix A bug fix

Development

Successfully merging this pull request may close these issues.

GPU decode silently skips extension-typed columns (date/timestamp)

2 participants