perfect hash join #19411

UBarney · 2025-12-19T15:07:15Z

Which issue does this PR close?

Closes [EPIC]: Perfect Hash Join #17635.

Rationale for this change

This PR introduces a Perfect Hash Join optimization that uses an array-based direct mapping (ArrayMap) instead of a HashMap.
The array-based approach outperforms the standard hash join when the build-side keys are dense (i.e., the ratio count / (max - min + 1) is high) or when the key range (max - min) is sufficiently small.

The following results are from the hj.rs benchmark suite. The benchmark was run with the optimization enabled by setting DATAFUSION_EXECUTION_PERFECT_HASH_JOIN_MIN_KEY_DENSITY=0.1.


┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query                                                  ┃   base_hj ┃ density=0.1 ┃        Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1_density=1_prob_hit=1_25*1.5M                  │   5.50 ms │     4.54 ms │ +1.21x faster │
│ QQuery 2_density=0.026_prob_hit=1_25*1.5M              │   6.13 ms │     5.43 ms │ +1.13x faster │
│ QQuery 3_density=1_prob_hit=1_100K*60M                 │ 132.59 ms │    97.42 ms │ +1.36x faster │
│ QQuery 4_density=1_prob_hit=0.1_100K*60M               │ 146.66 ms │    97.75 ms │ +1.50x faster │
│ QQuery 5_density=0.75_prob_hit=1_100K*60M              │ 139.85 ms │   103.82 ms │ +1.35x faster │
│ QQuery 6_density=0.75_prob_hit=0.1_100K*60M            │ 256.62 ms │   192.15 ms │ +1.34x faster │
│ QQuery 7_density=0.5_prob_hit=1_100K*60M               │ 136.27 ms │    91.64 ms │ +1.49x faster │
│ QQuery 8_density=0.5_prob_hit=0.1_100K*60M             │ 234.89 ms │   185.35 ms │ +1.27x faster │
│ QQuery 9_density=0.2_prob_hit=1_100K*60M               │ 132.76 ms │    98.44 ms │ +1.35x faster │
│ QQuery 10_density=0.2_prob_hit=0.1_100K*60M            │ 240.04 ms │   184.93 ms │ +1.30x faster │
│ QQuery 11_density=0.1_prob_hit=1_100K*60M              │ 133.02 ms │   108.11 ms │ +1.23x faster │
│ QQuery 12_density=0.1_prob_hit=0.1_100K*60M            │ 235.44 ms │   209.10 ms │ +1.13x faster │
│ QQuery 13_density=0.01_prob_hit=1_100K*60M             │ 135.64 ms │   132.52 ms │     no change │
│ QQuery 14_density=0.01_prob_hit=0.1_100K*60M           │ 235.88 ms │   234.62 ms │     no change │
│ QQuery 15_density=0.2_prob_hit=0.1_100K_(20%_dups)*60M │ 178.49 ms │   147.55 ms │ +1.21x faster │
└────────────────────────────────────────────────────────┴───────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (base_hj)       │ 2349.79ms │
│ Total Time (density=0.1)   │ 1893.37ms │
│ Average Time (base_hj)     │  156.65ms │
│ Average Time (density=0.1) │  126.22ms │
│ Queries Faster             │        13 │
│ Queries Slower             │         0 │
│ Queries with No Change     │         2 │
│ Queries with Failure       │         0 │
└────────────────────────────┴───────────┘

The following results are from tpch-sf10:

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       base ┃ perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  739.66 ms │  743.84 ms │     no change │
│ QQuery 2     │  315.94 ms │  317.53 ms │     no change │
│ QQuery 3     │  655.79 ms │  669.24 ms │     no change │
│ QQuery 4     │  215.48 ms │  218.79 ms │     no change │
│ QQuery 5     │ 1131.42 ms │ 1146.03 ms │     no change │
│ QQuery 6     │  202.32 ms │  190.83 ms │ +1.06x faster │
│ QQuery 7     │ 1734.06 ms │ 1710.50 ms │     no change │
│ QQuery 8     │ 1185.05 ms │ 1173.90 ms │     no change │
│ QQuery 9     │ 2036.76 ms │ 1994.30 ms │     no change │
│ QQuery 10    │  907.32 ms │  893.20 ms │     no change │
│ QQuery 11    │  306.63 ms │  275.46 ms │ +1.11x faster │
│ QQuery 12    │  404.00 ms │  381.95 ms │ +1.06x faster │
│ QQuery 13    │  531.67 ms │  498.58 ms │ +1.07x faster │
│ QQuery 14    │  317.63 ms │  303.04 ms │     no change │
│ QQuery 15    │  602.24 ms │  572.18 ms │     no change │
│ QQuery 16    │  200.00 ms │  201.68 ms │     no change │
│ QQuery 17    │ 1848.67 ms │ 1790.60 ms │     no change │
│ QQuery 18    │ 2130.63 ms │ 2179.84 ms │     no change │
│ QQuery 19    │  501.81 ms │  529.85 ms │  1.06x slower │
│ QQuery 20    │  637.91 ms │  661.72 ms │     no change │
│ QQuery 21    │ 1882.43 ms │ 1917.10 ms │     no change │
│ QQuery 22    │  130.68 ms │  141.76 ms │  1.08x slower │
└──────────────┴────────────┴────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (base)         │ 18618.10ms │
│ Total Time (perfect_hj)   │ 18511.93ms │
│ Average Time (base)       │   846.28ms │
│ Average Time (perfect_hj) │   841.45ms │
│ Queries Faster            │          4 │
│ Queries Slower            │          2 │
│ Queries with No Change    │         16 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘

What changes are included in this PR?

During the collect_left_input (build) phase, we now conditionally use an ArrayMap instead of a standard JoinHashMapType. This optimization is triggered only when all of the following conditions are met:
- There is exactly one join key.
- The join key is an integer type convertible to u64 (i128 and u128 are excluded).
- The key distribution is sufficiently dense, or the key range (max - min) is small enough to justify an array-based allocation.
- build_side.num_rows() < u32::MAX

The ArrayMap stores the minimum key as an offset and uses a Vec to map a key k directly to its build-side index via data[k - offset].

The hash join micro-benchmarks in benchmarks/src/hj.rs are rewritten to evaluate ArrayMap and HashMap performance across varying key densities and probe hit rates.
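To make the data[k - offset] mapping described above concrete, here is a minimal sketch of an array-based map. It is not the PR's actual ArrayMap: the type name SimpleArrayMap, the method names build and probe, and the way duplicates are chained through next are all assumptions made for illustration. The sketch follows the snippets quoted in this thread by using 0 as the "not found" sentinel (so stored values are row index + 1) and by leaving next empty when there are no duplicate keys.

```rust
/// Illustrative only: a simplified array-based map, NOT the PR's `ArrayMap`.
struct SimpleArrayMap {
    /// Minimum build-side key; subtracted from every key before indexing.
    offset: u64,
    /// data[k - offset] holds (row index + 1) of the latest row with key k; 0 means "not found".
    data: Vec<u32>,
    /// Chain for duplicate keys: next[row] holds (previous row with the same key + 1).
    /// Stays empty when every build-side key is unique (assumption for this sketch).
    next: Vec<u32>,
}

impl SimpleArrayMap {
    /// Builds the map from build-side keys; returns None for empty or oversized input.
    fn build(keys: &[u64]) -> Option<Self> {
        let min = *keys.iter().min()?;
        let max = *keys.iter().max()?;
        // Row indices are stored as u32 with 0 reserved as the sentinel.
        if keys.len() >= u32::MAX as usize {
            return None;
        }
        let range = usize::try_from(max - min).ok()?.checked_add(1)?;
        let mut data = vec![0u32; range];
        let mut next: Vec<u32> = vec![];
        for (row, &k) in keys.iter().enumerate() {
            let slot = (k - min) as usize;
            let prev = data[slot];
            if prev != 0 {
                // First duplicate seen: allocate the chain lazily.
                if next.is_empty() {
                    next = vec![0; keys.len()];
                }
                next[row] = prev;
            }
            data[slot] = row as u32 + 1;
        }
        Some(Self { offset: min, data, next })
    }

    /// Returns all build-side row indices whose key equals `key` (most recent row first).
    fn probe(&self, key: u64) -> Vec<usize> {
        let mut rows = Vec::new();
        let slot = match key.checked_sub(self.offset).and_then(|d| usize::try_from(d).ok()) {
            Some(s) => s,
            None => return rows, // key is below the minimum build-side key
        };
        let mut cur = self.data.get(slot).copied().unwrap_or(0);
        while cur != 0 {
            let row = (cur - 1) as usize;
            rows.push(row);
            cur = self.next.get(row).copied().unwrap_or(0);
        }
        rows
    }
}
```

For example, building from the keys [10, 12, 10, 15] and probing for 10 walks the chain from row 2 back to row 0, while probing for 11 hits a zero slot and returns nothing. No hashing or collision handling is involved, which is where the speedup on dense keys would be expected to come from.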

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, this PR introduces two new session configuration parameters to control the behavior of the Perfect Hash Join optimization:

- perfect_hash_join_small_build_threshold: The maximum key range (max_key - min_key) for the build side to be considered "small." If the key range is below this threshold, the array-based join is triggered regardless of key density.
- perfect_hash_join_min_key_density: The minimum density (row_count / key_range) required to enable the perfect hash join optimization for larger key ranges.
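How the two parameters might combine can be sketched as a single predicate. This is an assumption-laden illustration, not the PR's code: the function name use_array_map and its signature are hypothetical, and the density is computed as count / (max - min + 1), the form given in the rationale section.

```rust
/// Hypothetical helper (not from the PR): decides whether the array-based path applies.
/// `small_build_threshold` and `min_key_density` stand in for the two config options.
fn use_array_map(
    min_key: u64,
    max_key: u64,
    num_rows: usize,
    small_build_threshold: u64,
    min_key_density: f64,
) -> bool {
    // The PR only supports build sides with fewer than u32::MAX rows.
    if num_rows >= u32::MAX as usize || max_key < min_key {
        return false;
    }
    let key_range = max_key - min_key;
    // Small key range: use the array regardless of density.
    if key_range < small_build_threshold {
        return true;
    }
    // Otherwise require enough rows per array slot.
    let slots = key_range as f64 + 1.0;
    num_rows as f64 / slots >= min_key_density
}
```

With a density threshold of 0.2, 100 rows spread over keys 0..=99 qualify (density 1.0), while 100 rows spread over keys 0..=999_999 do not unless the small-build threshold exceeds that range.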

Dandandan · 2025-12-19T18:40:41Z

datafusion/physical-plan/src/joins/array_map.rs

+    ) -> Result<Self> {
+        // Initialize with 0 (sentinel for not found)
+        let mut data: Vec<u32> = vec![0; range];
+        let mut next: Option<Vec<u32>> = None;


Suggested change

let mut next: Option<Vec<u32>> = None;

let mut next: Vec<u32> = vec![];

I think this should work as well

Done, it's cleaner this way

Dandandan · 2025-12-19T18:54:58Z

run benchmarks

alamb-ghbot · 2025-12-19T18:55:06Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing perfect_hj (a442460) to 8550010 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

Dandandan · 2025-12-19T18:55:07Z

run benchmark tpcds

Dandandan · 2025-12-19T19:00:15Z

datafusion/common/src/config.rs

+        /// 
+        /// TODO: Currently only supports cases where left_side.num_rows() < u32::MAX.
+        /// Support for left_side.num_rows() >= u32::MAX will be added in the future.
+        pub perfect_hash_join_min_key_density: f64, default = 0.99


This seems very high? For a hashmap I believe it's ~75% default (plus it has some more overhead per key), so I think a 75% probably could still be better overall?

That's a great point.
I'll add some benchmarks to compare the performance at different densities, including 75%, to find the optimal value for our use case. I'll update this based on the results.
Thanks for the suggestion!

Sounds good!

I think we can set it to 20%.

In terms of memory usage (run with this code: ), ArrayMap consumes less memory than JoinHashMap at a 20% density, even with duplicate keys.

Based on the hj.rs benchmark (results in the PR description), ArrayMap is also faster than JoinHashMap at 20% density, regardless of whether there are duplicate keys.

Memory Comparison Matrix (num_rows = 1000000)

| Density | Dup Rate | ArrayMap (MB) | JoinHashMap (MB) | Ratio (AM/JHM) |
|---------|----------|---------------|------------------|----------------|
| 100% | 0% | 3.81 | 37.81 | 0.10x |
| 100% | 25% | 7.63 | 37.81 | 0.20x |
| 100% | 50% | 7.63 | 37.81 | 0.20x |
| 100% | 75% | 7.63 | 37.81 | 0.20x |
| 75% | 0% | 5.09 | 37.81 | 0.13x |
| 75% | 25% | 8.90 | 37.81 | 0.24x |
| 75% | 50% | 8.90 | 37.81 | 0.24x |
| 75% | 75% | 8.90 | 37.81 | 0.24x |
| 50% | 0% | 7.63 | 37.81 | 0.20x |
| 50% | 25% | 11.44 | 37.81 | 0.30x |
| 50% | 50% | 11.44 | 37.81 | 0.30x |
| 50% | 75% | 11.44 | 37.81 | 0.30x |
| 20% | 0% | 19.07 | 37.81 | 0.50x |
| 20% | 25% | 22.89 | 37.81 | 0.61x |
| 20% | 50% | 22.89 | 37.81 | 0.61x |
| 20% | 75% | 22.89 | 37.81 | 0.61x |
| 10% | 0% | 38.15 | 37.81 | 1.01x |
| 10% | 25% | 41.96 | 37.81 | 1.11x |
| 10% | 50% | 41.96 | 37.81 | 1.11x |
| 10% | 75% | 41.96 | 37.81 | 1.11x |
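The ArrayMap column appears to follow a simple back-of-the-envelope model, sketched below. This is an inference from the numbers in the table, not the measurement code that produced them: it assumes one 4-byte u32 slot per key in the range (num_rows / density slots), one additional 4-byte entry per row only when duplicates exist, and that "MB" means 2^20 bytes. The function name array_map_mib is hypothetical.

```rust
/// Bytes per mebibyte; the table's "MB" values match this unit (assumption).
const MIB: f64 = 1024.0 * 1024.0;

/// Hypothetical estimate of array-map memory, inferred from the table above.
/// `density` is rows per slot in (0, 1]; `has_duplicates` adds a per-row chain.
fn array_map_mib(num_rows: usize, density: f64, has_duplicates: bool) -> f64 {
    // One u32 slot for every key in the covered range.
    let slots = (num_rows as f64 / density).ceil();
    let data_bytes = slots * 4.0;
    // One extra u32 per build-side row, allocated only if any key repeats.
    let next_bytes = if has_duplicates { num_rows as f64 * 4.0 } else { 0.0 };
    (data_bytes + next_bytes) / MIB
}
```

Under this model the duplicate rate itself does not matter, only whether any duplicates exist, which would explain why the 25%, 50%, and 75% rows are identical within each density. It would also explain why the array's footprint crosses the reported JoinHashMap figure near 10% density, where the slot array alone reaches roughly 38 MB.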

alamb-ghbot · 2025-12-19T19:34:36Z

🤖: Benchmark completed

Details

Comparing HEAD and perfect_hj
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2728.43 ms │  2746.77 ms │ no change │
│ QQuery 1     │  1259.76 ms │  1286.32 ms │ no change │
│ QQuery 2     │  2442.97 ms │  2404.14 ms │ no change │
│ QQuery 3     │  1146.01 ms │  1175.40 ms │ no change │
│ QQuery 4     │  2324.58 ms │  2297.95 ms │ no change │
│ QQuery 5     │ 28847.71 ms │ 28651.86 ms │ no change │
│ QQuery 6     │  3925.44 ms │  4071.92 ms │ no change │
│ QQuery 7     │  3480.92 ms │  3513.25 ms │ no change │
└──────────────┴─────────────┴─────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 46155.84ms │
│ Total Time (perfect_hj)   │ 46147.61ms │
│ Average Time (HEAD)       │  5769.48ms │
│ Average Time (perfect_hj) │  5768.45ms │
│ Queries Faster            │          0 │
│ Queries Slower            │          0 │
│ Queries with No Change    │          8 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.55 ms │     2.28 ms │ +1.12x faster │
│ QQuery 1     │    49.94 ms │    49.72 ms │     no change │
│ QQuery 2     │   136.75 ms │   135.77 ms │     no change │
│ QQuery 3     │   157.55 ms │   156.38 ms │     no change │
│ QQuery 4     │  1127.59 ms │  1099.95 ms │     no change │
│ QQuery 5     │  1513.28 ms │  1540.48 ms │     no change │
│ QQuery 6     │     2.09 ms │     2.18 ms │     no change │
│ QQuery 7     │    54.92 ms │    54.54 ms │     no change │
│ QQuery 8     │  1435.36 ms │  1429.02 ms │     no change │
│ QQuery 9     │  1826.50 ms │  1818.96 ms │     no change │
│ QQuery 10    │   362.05 ms │   359.18 ms │     no change │
│ QQuery 11    │   413.64 ms │   415.45 ms │     no change │
│ QQuery 12    │  1346.02 ms │  1355.55 ms │     no change │
│ QQuery 13    │  2003.64 ms │  2030.93 ms │     no change │
│ QQuery 14    │  1258.62 ms │  1264.24 ms │     no change │
│ QQuery 15    │  1236.44 ms │  1220.63 ms │     no change │
│ QQuery 16    │  2660.06 ms │  2679.40 ms │     no change │
│ QQuery 17    │  2646.48 ms │  2669.57 ms │     no change │
│ QQuery 18    │  5032.89 ms │  5243.53 ms │     no change │
│ QQuery 19    │   119.08 ms │   120.08 ms │     no change │
│ QQuery 20    │  1909.83 ms │  1955.97 ms │     no change │
│ QQuery 21    │  2219.56 ms │  2228.45 ms │     no change │
│ QQuery 22    │  3754.10 ms │  3742.17 ms │     no change │
│ QQuery 23    │ 12284.83 ms │ 12304.70 ms │     no change │
│ QQuery 24    │   200.89 ms │   216.80 ms │  1.08x slower │
│ QQuery 25    │   459.79 ms │   477.52 ms │     no change │
│ QQuery 26    │   217.46 ms │   226.68 ms │     no change │
│ QQuery 27    │  2733.98 ms │  2714.81 ms │     no change │
│ QQuery 28    │ 24706.02 ms │ 23446.26 ms │ +1.05x faster │
│ QQuery 29    │   983.86 ms │   951.63 ms │     no change │
│ QQuery 30    │  1325.66 ms │  1334.96 ms │     no change │
│ QQuery 31    │  1361.59 ms │  1319.42 ms │     no change │
│ QQuery 32    │  4654.91 ms │  4894.57 ms │  1.05x slower │
│ QQuery 33    │  5895.01 ms │  5580.74 ms │ +1.06x faster │
│ QQuery 34    │  5921.45 ms │  5968.72 ms │     no change │
│ QQuery 35    │  1944.76 ms │  1894.39 ms │     no change │
│ QQuery 36    │    66.49 ms │    65.88 ms │     no change │
│ QQuery 37    │    45.71 ms │    47.05 ms │     no change │
│ QQuery 38    │    66.55 ms │    67.85 ms │     no change │
│ QQuery 39    │   105.16 ms │   104.80 ms │     no change │
│ QQuery 40    │    28.03 ms │    27.95 ms │     no change │
│ QQuery 41    │    23.88 ms │    24.08 ms │     no change │
│ QQuery 42    │    19.99 ms │    20.73 ms │     no change │
└──────────────┴─────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 94314.96ms │
│ Total Time (perfect_hj)   │ 93263.97ms │
│ Average Time (HEAD)       │  2193.37ms │
│ Average Time (perfect_hj) │  2168.93ms │
│ Queries Faster            │          3 │
│ Queries Slower            │          2 │
│ Queries with No Change    │         38 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 139.34 ms │  130.00 ms │ +1.07x faster │
│ QQuery 2     │  27.17 ms │   23.45 ms │ +1.16x faster │
│ QQuery 3     │  40.09 ms │   38.79 ms │     no change │
│ QQuery 4     │  29.24 ms │   28.73 ms │     no change │
│ QQuery 5     │  89.41 ms │   88.45 ms │     no change │
│ QQuery 6     │  20.11 ms │   19.98 ms │     no change │
│ QQuery 7     │ 233.64 ms │  193.84 ms │ +1.21x faster │
│ QQuery 8     │  40.29 ms │   38.67 ms │     no change │
│ QQuery 9     │ 108.11 ms │  102.06 ms │ +1.06x faster │
│ QQuery 10    │  64.69 ms │   64.74 ms │     no change │
│ QQuery 11    │  18.64 ms │   11.46 ms │ +1.63x faster │
│ QQuery 12    │  51.76 ms │   49.96 ms │     no change │
│ QQuery 13    │  47.87 ms │   48.78 ms │     no change │
│ QQuery 14    │  14.03 ms │   14.30 ms │     no change │
│ QQuery 15    │  25.20 ms │   25.20 ms │     no change │
│ QQuery 16    │  25.60 ms │   24.53 ms │     no change │
│ QQuery 17    │ 156.37 ms │  151.59 ms │     no change │
│ QQuery 18    │ 283.58 ms │  283.26 ms │     no change │
│ QQuery 19    │  35.96 ms │   38.98 ms │  1.08x slower │
│ QQuery 20    │  48.62 ms │   50.36 ms │     no change │
│ QQuery 21    │ 312.85 ms │  309.67 ms │     no change │
│ QQuery 22    │  17.67 ms │   18.01 ms │     no change │
└──────────────┴───────────┴────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary         ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 1830.21ms │
│ Total Time (perfect_hj)   │ 1754.80ms │
│ Average Time (HEAD)       │   83.19ms │
│ Average Time (perfect_hj) │   79.76ms │
│ Queries Faster            │         5 │
│ Queries Slower            │         1 │
│ Queries with No Change    │        16 │
│ Queries with Failure      │         0 │
└───────────────────────────┴───────────┘

alamb-ghbot · 2025-12-19T19:34:41Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing perfect_hj (a442460) to 8550010 diff using: tpcds
Results will be posted here when complete

alamb-ghbot · 2025-12-19T19:43:35Z

🤖: Benchmark completed

Details

Comparing HEAD and perfect_hj
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │    64.32 ms │    64.03 ms │     no change │
│ QQuery 2     │   202.76 ms │   209.93 ms │     no change │
│ QQuery 3     │   160.17 ms │   160.30 ms │     no change │
│ QQuery 4     │  2076.33 ms │  2016.14 ms │     no change │
│ QQuery 5     │   271.72 ms │   267.86 ms │     no change │
│ QQuery 6     │  1539.80 ms │  1528.76 ms │     no change │
│ QQuery 7     │   501.34 ms │   500.09 ms │     no change │
│ QQuery 8     │   172.26 ms │   172.88 ms │     no change │
│ QQuery 9     │   283.58 ms │   274.82 ms │     no change │
│ QQuery 10    │   169.65 ms │   172.34 ms │     no change │
│ QQuery 11    │  1414.43 ms │  1436.15 ms │     no change │
│ QQuery 12    │    74.91 ms │    73.81 ms │     no change │
│ QQuery 13    │   554.37 ms │   557.49 ms │     no change │
│ QQuery 14    │  2016.91 ms │  1803.46 ms │ +1.12x faster │
│ QQuery 15    │    28.50 ms │    27.30 ms │     no change │
│ QQuery 16    │    56.60 ms │    57.60 ms │     no change │
│ QQuery 17    │   354.64 ms │   357.88 ms │     no change │
│ QQuery 18    │   191.69 ms │   190.50 ms │     no change │
│ QQuery 19    │   228.28 ms │   229.67 ms │     no change │
│ QQuery 20    │    23.36 ms │    23.50 ms │     no change │
│ QQuery 21    │    33.71 ms │    32.15 ms │     no change │
│ QQuery 22    │   985.29 ms │   906.05 ms │ +1.09x faster │
│ QQuery 23    │  1845.20 ms │  1713.30 ms │ +1.08x faster │
│ QQuery 24    │   642.89 ms │   636.64 ms │     no change │
│ QQuery 25    │   523.12 ms │   516.07 ms │     no change │
│ QQuery 26    │   123.94 ms │   126.15 ms │     no change │
│ QQuery 27    │   504.93 ms │   500.48 ms │     no change │
│ QQuery 28    │   303.81 ms │   302.04 ms │     no change │
│ QQuery 29    │   444.98 ms │   447.70 ms │     no change │
│ QQuery 30    │    65.01 ms │    63.94 ms │     no change │
│ QQuery 31    │   306.15 ms │   306.66 ms │     no change │
│ QQuery 32    │    77.86 ms │    78.00 ms │     no change │
│ QQuery 33    │   193.25 ms │   194.57 ms │     no change │
│ QQuery 34    │   159.04 ms │   162.10 ms │     no change │
│ QQuery 35    │   169.34 ms │   171.45 ms │     no change │
│ QQuery 36    │   296.09 ms │   286.26 ms │     no change │
│ QQuery 37    │   257.06 ms │   254.49 ms │     no change │
│ QQuery 38    │   155.26 ms │   151.49 ms │     no change │
│ QQuery 39    │   216.46 ms │   194.61 ms │ +1.11x faster │
│ QQuery 40    │   197.40 ms │   172.47 ms │ +1.14x faster │
│ QQuery 41    │    16.30 ms │    16.66 ms │     no change │
│ QQuery 42    │   142.70 ms │   145.52 ms │     no change │
│ QQuery 43    │   126.26 ms │   129.13 ms │     no change │
│ QQuery 44    │    15.68 ms │    14.96 ms │     no change │
│ QQuery 45    │    83.14 ms │    80.86 ms │     no change │
│ QQuery 46    │   329.53 ms │   329.04 ms │     no change │
│ QQuery 47    │  1279.95 ms │  1186.17 ms │ +1.08x faster │
│ QQuery 48    │   416.35 ms │   420.97 ms │     no change │
│ QQuery 49    │   355.14 ms │   354.37 ms │     no change │
│ QQuery 50    │   338.48 ms │   343.17 ms │     no change │
│ QQuery 51    │   298.83 ms │   294.59 ms │     no change │
│ QQuery 52    │   147.14 ms │   143.27 ms │     no change │
│ QQuery 53    │   149.53 ms │   146.40 ms │     no change │
│ QQuery 54    │   207.43 ms │   216.19 ms │     no change │
│ QQuery 55    │   144.21 ms │   144.60 ms │     no change │
│ QQuery 56    │   193.16 ms │   190.45 ms │     no change │
│ QQuery 57    │   319.98 ms │   302.28 ms │ +1.06x faster │
│ QQuery 58    │   515.69 ms │   482.75 ms │ +1.07x faster │
│ QQuery 59    │   285.93 ms │   288.57 ms │     no change │
│ QQuery 60    │   199.98 ms │   199.54 ms │     no change │
│ QQuery 61    │   240.08 ms │   235.47 ms │     no change │
│ QQuery 62    │  1378.44 ms │  1398.12 ms │     no change │
│ QQuery 63    │   149.43 ms │   148.88 ms │     no change │
│ QQuery 64    │  1142.94 ms │  1161.26 ms │     no change │
│ QQuery 65    │   368.38 ms │   363.41 ms │     no change │
│ QQuery 66    │   389.40 ms │   393.14 ms │     no change │
│ QQuery 67    │   646.76 ms │   627.61 ms │     no change │
│ QQuery 68    │   383.87 ms │   384.78 ms │     no change │
│ QQuery 69    │   167.41 ms │   170.00 ms │     no change │
│ QQuery 70    │   526.93 ms │   528.35 ms │     no change │
│ QQuery 71    │   183.79 ms │   184.70 ms │     no change │
│ QQuery 72    │  2474.00 ms │  2083.02 ms │ +1.19x faster │
│ QQuery 73    │   156.03 ms │   157.55 ms │     no change │
│ QQuery 74    │   893.90 ms │   899.19 ms │     no change │
│ QQuery 75    │   394.28 ms │   389.15 ms │     no change │
│ QQuery 76    │   184.12 ms │   182.03 ms │     no change │
│ QQuery 77    │   265.99 ms │   262.24 ms │     no change │
│ QQuery 78    │   939.94 ms │   953.61 ms │     no change │
│ QQuery 79    │   341.30 ms │   340.35 ms │     no change │
│ QQuery 80    │   496.20 ms │   493.37 ms │     no change │
│ QQuery 81    │    43.41 ms │    40.77 ms │ +1.06x faster │
│ QQuery 82    │   296.71 ms │   283.45 ms │     no change │
│ QQuery 83    │    67.95 ms │    62.11 ms │ +1.09x faster │
│ QQuery 84    │    64.72 ms │    65.66 ms │     no change │
│ QQuery 85    │   214.94 ms │   226.66 ms │  1.05x slower │
│ QQuery 86    │    60.18 ms │    57.66 ms │     no change │
│ QQuery 87    │   151.20 ms │   157.81 ms │     no change │
│ QQuery 88    │   243.17 ms │   250.08 ms │     no change │
│ QQuery 89    │   174.02 ms │   172.62 ms │     no change │
│ QQuery 90    │    36.80 ms │    37.36 ms │     no change │
│ QQuery 91    │    95.95 ms │    92.97 ms │     no change │
│ QQuery 92    │    78.25 ms │    77.30 ms │     no change │
│ QQuery 93    │   270.50 ms │   263.82 ms │     no change │
│ QQuery 94    │    85.61 ms │    84.43 ms │     no change │
│ QQuery 95    │   260.54 ms │   228.76 ms │ +1.14x faster │
│ QQuery 96    │   111.78 ms │   110.86 ms │     no change │
│ QQuery 97    │   186.45 ms │   187.03 ms │     no change │
│ QQuery 98    │   235.54 ms │   237.35 ms │     no change │
│ QQuery 99    │ 14895.30 ms │ 15020.29 ms │     no change │
└──────────────┴─────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 52748.01ms │
│ Total Time (perfect_hj)   │ 51783.83ms │
│ Average Time (HEAD)       │   532.81ms │
│ Average Time (perfect_hj) │   523.07ms │
│ Queries Faster            │         12 │
│ Queries Slower            │          1 │
│ Queries with No Change    │         86 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘

UBarney · 2025-12-21T14:22:54Z

benchmarks/compare.py

    comparison = BenchmarkRun.load_from_file(comparison_path)

-    console = Console()
+    console = Console(width=200)


I've increased the console width to 200. I added more information, such as 'density', to the query name, which made it longer and caused it to be cut off in the output.

UBarney · 2025-12-21T14:22:56Z

benchmarks/compare.py

-    baseline_header = baseline_path.parent.stem
-    comparison_header = comparison_path.parent.stem
+    baseline_header = baseline_path.parent.name
+    comparison_header = comparison_path.parent.name


Before, a path like .../density=0.1/... was incorrectly shortened to density=0, because .stem treats the trailing .1 as a file suffix and strips it. Now, by using .parent.name, we correctly get the full directory name, density=0.1.

github-actions bot added common Related to common crate physical-plan Changes to the physical-plan crate labels Dec 19, 2025

Dandandan reviewed Dec 19, 2025


UBarney force-pushed the perfect_hj branch from a442460 to 31efb2d Compare December 21, 2025 14:15

UBarney commented Dec 21, 2025


github-actions bot added proto Related to proto crate documentation Improvements or additions to documentation sqllogictest SQL Logic Tests (.slt) labels Dec 23, 2025

UBarney force-pushed the perfect_hj branch 4 times, most recently from 8044bce to 954f1df Compare December 28, 2025 08:02

UBarney marked this pull request as ready for review December 28, 2025 08:02

feat: support perfect hash join

35018b6

UBarney force-pushed the perfect_hj branch from 954f1df to 35018b6 Compare December 31, 2025 06:48
