Minor touchups for vortex-datafusion#8356

Merged
AdamGS merged 1 commit into develop from adamg/df-minor-touchup-20260611 on Jun 11, 2026
Conversation

@AdamGS AdamGS commented Jun 11, 2026

Contributor

Summary

Some minor touchups to the FileSource based DataFusion integration.

The two substantial changes here are:

  1. When reading a file whose footer is not already in the cache, insert the footer into the cache after reading it. A cache miss can happen when using ListingTable without stats inference, or when using FileScanConfig directly in another table provider.
  2. On write, move the schema-to-dtype conversion out of the loop. It only needs to happen once; the resulting dtype is then cloned for each write task.
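
The two changes above can be illustrated with a minimal, std-only sketch. It is not the code from this PR: `Footer`, `Schema`, `DType`, `FooterCache`, `open_with_cache`, `schema_to_dtype`, and `plan_write_tasks` are all hypothetical stand-ins, and the real Vortex and DataFusion types and signatures may differ.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the real footer, schema, and dtype types.
#[derive(Debug, Clone, PartialEq)]
struct Footer {
    row_count: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct DType {
    fields: Vec<String>,
}

/// Hypothetical footer cache keyed by file path.
#[derive(Default)]
struct FooterCache {
    inner: Mutex<HashMap<String, Arc<Footer>>>,
}

impl FooterCache {
    fn get(&self, path: &str) -> Option<Arc<Footer>> {
        self.inner.lock().unwrap().get(path).cloned()
    }

    fn insert(&self, path: &str, footer: Arc<Footer>) {
        self.inner.lock().unwrap().insert(path.to_string(), footer);
    }
}

/// Change 1 (assumed shape): reuse a cached footer when present;
/// on a miss, read the footer and insert it so later opens hit the cache.
fn open_with_cache(
    cache: &FooterCache,
    path: &str,
    read_footer: impl FnOnce(&str) -> Footer,
) -> Arc<Footer> {
    if let Some(footer) = cache.get(path) {
        return footer;
    }
    let footer = Arc::new(read_footer(path));
    cache.insert(path, Arc::clone(&footer));
    footer
}

/// Stand-in for the schema-to-dtype conversion.
fn schema_to_dtype(schema: &Schema) -> DType {
    DType {
        fields: schema.fields.clone(),
    }
}

/// Change 2 (assumed shape): convert once outside the loop,
/// then hand each write task its own clone of the dtype.
fn plan_write_tasks(schema: &Schema, num_tasks: usize) -> Vec<DType> {
    let dtype = schema_to_dtype(schema);
    (0..num_tasks).map(|_| dtype.clone()).collect()
}
```

In this sketch, opening the same path twice calls `read_footer` only on the first open, and `plan_write_tasks` runs the conversion once regardless of the number of tasks.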

Signed-off-by: Adam Gutglick <adam@spiraldb.com>
@AdamGS AdamGS requested review from a team, a10y and asubiotto June 11, 2026 12:56
@AdamGS AdamGS added the ext/datafusion (Relates to the DataFusion integration) and changelog/chore (A trivial change) labels Jun 11, 2026
codspeed-hq Bot commented Jun 11, 2026


Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 4 improved benchmarks
❌ 5 regressed benchmarks
✅ 1523 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | decompress_rd[f64, (100000, 0.01)] | 845.7 µs | 981.3 µs | -13.82% |
| Simulation | decompress_rd[f64, (100000, 0.1)] | 845.7 µs | 981.2 µs | -13.82% |
| Simulation | encode_varbin[(1000, 4)] | 143.2 µs | 160.3 µs | -10.65% |
| Simulation | encode_varbin[(1000, 8)] | 143.9 µs | 161 µs | -10.64% |
| Simulation | encode_varbin[(1000, 32)] | 148.8 µs | 165.6 µs | -10.14% |
| Simulation | decompress_rd[f64, (100000, 0.0)] | 1,024.4 µs | 845.6 µs | +21.15% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 586.7 µs | 499.1 µs | +17.54% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 244.4 ns | 215.3 ns | +13.55% |
| Simulation | bitwise_not_vortex_buffer_mut[1024] | 304.7 ns | 275.6 ns | +10.58% |
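
The Efficiency figures in this report appear consistent with BASE / HEAD − 1, up to rounding of the displayed timings. That formula is inferred from the numbers here, not taken from CodSpeed documentation, and `efficiency_percent` is a hypothetical helper name. A small sketch of the arithmetic:

```rust
/// Relative change in percent, assuming Efficiency = BASE / HEAD - 1.
/// Both timings must be in the same unit. A negative result means HEAD is slower.
fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

For example, `efficiency_percent(845.7, 981.3)` comes out at roughly -13.82, matching the first regressed row.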

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing adamg/df-minor-touchup-20260611 (1d7b1bd) with develop (1535ced)

@joseph-isaacs joseph-isaacs marked this pull request as draft June 11, 2026 13:07
    let footer_cache_hit = cached_footer.is_some();

    if let Some(footer) = cached_footer {
        open_opts = open_opts.with_footer(footer);

Contributor

This now has a cache!

Contributor Author

where?

@AdamGS AdamGS marked this pull request as ready for review June 11, 2026 13:13

@robert3005 robert3005 left a comment

Contributor

what does Joe mean?

@AdamGS AdamGS merged commit eda4dd0 into develop Jun 11, 2026
98 of 100 checks passed
@AdamGS AdamGS deleted the adamg/df-minor-touchup-20260611 branch June 11, 2026 14:24