Upgrade DataFusion to 54 #8044
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | compare[63] | 244.6 µs | 360.3 µs | -32.11% |
| ❌ | Simulation | compare[62] | 254.7 µs | 368.5 µs | -30.89% |
| ❌ | Simulation | compare[56] | 229.5 µs | 332 µs | -30.89% |
| ❌ | Simulation | compare[60] | 248.3 µs | 358.4 µs | -30.71% |
| ❌ | Simulation | compare[61] | 255.4 µs | 367.3 µs | -30.46% |
| ❌ | Simulation | compare[58] | 245.5 µs | 351.9 µs | -30.24% |
| ❌ | Simulation | compare[59] | 250.8 µs | 359 µs | -30.14% |
| ❌ | Simulation | compare[57] | 246.1 µs | 350.5 µs | -29.79% |
| ❌ | Simulation | compare[54] | 236.2 µs | 335.1 µs | -29.51% |
| ❌ | Simulation | compare[55] | 241.1 µs | 341.9 µs | -29.47% |
| ❌ | Simulation | compare[52] | 230 µs | 325.1 µs | -29.27% |
| ❌ | Simulation | compare[48] | 212.2 µs | 300 µs | -29.24% |
| ❌ | Simulation | compare[53] | 236 µs | 333 µs | -29.13% |
| ❌ | Simulation | compare[50] | 226.9 µs | 318.4 µs | -28.72% |
| ❌ | Simulation | compare[51] | 232 µs | 325.3 µs | -28.68% |
| ❌ | Simulation | compare[49] | 227.5 µs | 317.1 µs | -28.25% |
| ❌ | Simulation | compare[46] | 217.9 µs | 301.9 µs | -27.81% |
| ❌ | Simulation | compare[47] | 222.8 µs | 308.6 µs | -27.81% |
| ❌ | Simulation | compare[44] | 211.3 µs | 291.5 µs | -27.53% |
| ❌ | Simulation | compare[45] | 218.2 µs | 300.3 µs | -27.34% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Comparing adamg/df-54 (585236e) with develop (5e3aedb)
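The Efficiency figures in the CodSpeed table appear consistent with `BASE / HEAD - 1`, so a negative value means HEAD is slower than BASE. The sketch below recomputes a few rows under that assumption. The function name `efficiency_change_pct` is made up for illustration, and the formula is inferred from the numbers, not taken from CodSpeed's documentation. Small differences from the reported values are expected because the table shows rounded timings.

```rust
/// Relative change in percent, assuming the report uses BASE / HEAD - 1.
/// This formula is inferred from the table, not confirmed by CodSpeed.
fn efficiency_change_pct(base_us: f64, head_us: f64) -> f64 {
    (base_us / head_us - 1.0) * 100.0
}

fn main() {
    // (benchmark, BASE in µs, HEAD in µs, reported Efficiency in %),
    // copied from rows of the table in this comment.
    let rows = [
        ("compare[63]", 244.6, 360.3, -32.11),
        ("compare[62]", 254.7, 368.5, -30.89),
        ("compare[45]", 218.2, 300.3, -27.34),
    ];

    for (name, base, head, reported) in rows {
        let computed = efficiency_change_pct(base, head);
        println!("{name}: computed {computed:.2}%, reported {reported:.2}%");
    }
}
```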
Polar Signals Profiling Results (latest run; 4 previous runs)
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.053x ➖
datafusion / vortex-file-compressed (1.053x ➖, 0↑ 3↓)
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.006x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.993x ➖, 0↑ 0↓)
datafusion / parquet (0.996x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.031x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.011x ➖, 0↑ 0↓)
duckdb / parquet (1.007x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.991x ➖, 1↑ 1↓)
datafusion / vortex-compact (1.012x ➖, 1↑ 3↓)
datafusion / parquet (1.025x ➖, 1↑ 2↓)
datafusion / arrow (0.991x ➖, 3↑ 3↓)
duckdb / vortex-file-compressed (1.010x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.008x ➖, 0↑ 0↓)
duckdb / parquet (1.003x ➖, 2↑ 2↓)
duckdb / duckdb (1.003x ➖, 0↑ 0↓)
File Size Changes (10 files changed, +0.1% overall, 6↑ 4↓)
Totals:
|
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.967x ➖, 11↑ 4↓)
datafusion / vortex-compact (0.966x ➖, 9↑ 3↓)
datafusion / parquet (0.968x ➖, 9↑ 4↓)
duckdb / vortex-file-compressed (0.991x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.984x ➖, 3↑ 2↓)
duckdb / parquet (1.000x ➖, 1↑ 0↓)
duckdb / duckdb (0.983x ➖, 3↑ 0↓)
File Size Changes (5 files changed, +0.0% overall, 4↑ 1↓)
Totals:
|
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.016x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.011x ➖, 0↑ 0↓)
duckdb / parquet (1.027x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
|
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.142x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.124x ➖, 0↑ 2↓)
datafusion / parquet (1.277x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.017x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.988x ➖, 0↑ 0↓)
duckdb / parquet (1.039x ➖, 0↑ 0↓)
|
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.945x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.960x ➖, 4↑ 1↓)
datafusion / parquet (1.015x ➖, 0↑ 2↓)
datafusion / arrow (1.029x ➖, 8↑ 10↓)
duckdb / vortex-file-compressed (0.997x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.998x ➖, 0↑ 0↓)
duckdb / parquet (0.998x ➖, 0↑ 0↓)
duckdb / duckdb (0.998x ➖, 0↑ 0↓)
File Size Changes (27 files changed, +0.0% overall, 13↑ 14↓)
Totals:
|
File Sizes: TPC-H SF=10 on NVME
File Size Changes (48 files changed, -0.0% overall, 0↑ 48↓)
Totals:
|
Benchmarks: Clickbench on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.931x ➖, 10↑ 7↓)
datafusion / parquet (0.877x ✅, 14↑ 3↓)
duckdb / vortex-file-compressed (1.078x ➖, 0↑ 19↓)
duckdb / parquet (1.043x ➖, 1↑ 4↓)
duckdb / duckdb (1.040x ➖, 1↑ 2↓)
File Size Changes (106 files changed, -0.0% overall, 44↑ 62↓)
Totals:
|
File Sizes: Clickbench on NVME
File Size Changes (201 files changed, -0.0% overall, 0↑ 201↓)
Totals:
|
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.170x ➖, 0↑ 7↓)
datafusion / vortex-compact (1.119x ➖, 0↑ 7↓)
datafusion / parquet (1.098x ➖, 0↑ 4↓)
duckdb / vortex-file-compressed (1.017x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.015x ➖, 0↑ 0↓)
duckdb / parquet (1.031x ➖, 0↑ 0↓)
|
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.014x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.042x ➖, 0↑ 2↓)
datafusion / parquet (1.066x ➖, 0↑ 4↓)
duckdb / vortex-file-compressed (1.135x ➖, 0↑ 1↓)
duckdb / vortex-compact (1.237x ➖, 0↑ 4↓)
duckdb / parquet (1.093x ➖, 0↑ 0↓)
|
Benchmarks: Random Access
Vortex (geomean): 1.231x ❌
unknown / unknown (1.205x ❌, 0↑ 33↓)
|
Benchmarks: Compression
Vortex (geomean): 1.001x ➖
unknown / unknown (1.018x ➖, 1↑ 5↓)
|
Benchmarks: Appian on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.015x ➖, 1↑ 1↓)
datafusion / parquet (1.029x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (1.010x ➖, 0↑ 0↓)
duckdb / parquet (1.009x ➖, 0↑ 0↓)
duckdb / duckdb (1.007x ➖, 0↑ 0↓)
File Size Changes (4 files changed, -0.0% overall, 0↑ 4↓)
Totals:
|
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
        return Ok(cast(child, cast_dtype));
    }

    if let Some(cast_col_expr) = df.as_any().downcast_ref::<df_expr::CastColumnExpr>() {
Why can we remove the cast expressions?
`CastColumnExpr` was removed, and its functionality was merged into `Cast`.
Summary
This PR upgrades our DataFusion dependency/integration to the upcoming 54 release. It aims to make the minimal set of changes; implementing the new `Morselizer` API will be part of a future PR (I have an old PR based on an earlier PoC, and I'll try to pull things from there when the time comes).

54.0.0 (Apr 2026 / May 2026) apache/datafusion#21080