Skip to content

Conversation

@UBarney
Copy link
Contributor

@UBarney UBarney commented Dec 19, 2025

Which issue does this PR close?

Rationale for this change

This PR introduces a Perfect Hash Join optimization by using an array-based direct mapping(ArrayMap) instead of a HashMap.
The array-based approach outperforms the standard Hash Join when the build-side keys are dense (i.e., the ratio of count / (max - min+1) is high) or when the key range (max - min) is sufficiently small.

The following results from the hj.rs benchmark suite. The benchmark was executed with the optimization enabled by setting DATAFUSION_EXECUTION_PERFECT_HASH_JOIN_MIN_KEY_DENSITY=0.1


┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query                                                  ┃   base_hj ┃ density=0.1 ┃        Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1_density=1_prob_hit=1_25*1.5M                  │   5.50 ms │     4.54 ms │ +1.21x faster │
│ QQuery 2_density=0.026_prob_hit=1_25*1.5M              │   6.13 ms │     5.43 ms │ +1.13x faster │
│ QQuery 3_density=1_prob_hit=1_100K*60M                 │ 132.59 ms │    97.42 ms │ +1.36x faster │
│ QQuery 4_density=1_prob_hit=0.1_100K*60M               │ 146.66 ms │    97.75 ms │ +1.50x faster │
│ QQuery 5_density=0.75_prob_hit=1_100K*60M              │ 139.85 ms │   103.82 ms │ +1.35x faster │
│ QQuery 6_density=0.75_prob_hit=0.1_100K*60M            │ 256.62 ms │   192.15 ms │ +1.34x faster │
│ QQuery 7_density=0.5_prob_hit=1_100K*60M               │ 136.27 ms │    91.64 ms │ +1.49x faster │
│ QQuery 8_density=0.5_prob_hit=0.1_100K*60M             │ 234.89 ms │   185.35 ms │ +1.27x faster │
│ QQuery 9_density=0.2_prob_hit=1_100K*60M               │ 132.76 ms │    98.44 ms │ +1.35x faster │
│ QQuery 10_density=0.2_prob_hit=0.1_100K*60M            │ 240.04 ms │   184.93 ms │ +1.30x faster │
│ QQuery 11_density=0.1_prob_hit=1_100K*60M              │ 133.02 ms │   108.11 ms │ +1.23x faster │
│ QQuery 12_density=0.1_prob_hit=0.1_100K*60M            │ 235.44 ms │   209.10 ms │ +1.13x faster │
│ QQuery 13_density=0.01_prob_hit=1_100K*60M             │ 135.64 ms │   132.52 ms │     no change │
│ QQuery 14_density=0.01_prob_hit=0.1_100K*60M           │ 235.88 ms │   234.62 ms │     no change │
│ QQuery 15_density=0.2_prob_hit=0.1_100K_(20%_dups)*60M │ 178.49 ms │   147.55 ms │ +1.21x faster │
└────────────────────────────────────────────────────────┴───────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (base_hj)       │ 2349.79ms │
│ Total Time (density=0.1)   │ 1893.37ms │
│ Average Time (base_hj)     │  156.65ms │
│ Average Time (density=0.1) │  126.22ms │
│ Queries Faster             │        13 │
│ Queries Slower             │         0 │
│ Queries with No Change     │         2 │
│ Queries with Failure       │         0 │
└────────────────────────────┴───────────┘

The following results from tpch-sf10

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       base ┃ perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  739.66 ms │  743.84 ms │     no change │
│ QQuery 2     │  315.94 ms │  317.53 ms │     no change │
│ QQuery 3     │  655.79 ms │  669.24 ms │     no change │
│ QQuery 4     │  215.48 ms │  218.79 ms │     no change │
│ QQuery 5     │ 1131.42 ms │ 1146.03 ms │     no change │
│ QQuery 6     │  202.32 ms │  190.83 ms │ +1.06x faster │
│ QQuery 7     │ 1734.06 ms │ 1710.50 ms │     no change │
│ QQuery 8     │ 1185.05 ms │ 1173.90 ms │     no change │
│ QQuery 9     │ 2036.76 ms │ 1994.30 ms │     no change │
│ QQuery 10    │  907.32 ms │  893.20 ms │     no change │
│ QQuery 11    │  306.63 ms │  275.46 ms │ +1.11x faster │
│ QQuery 12    │  404.00 ms │  381.95 ms │ +1.06x faster │
│ QQuery 13    │  531.67 ms │  498.58 ms │ +1.07x faster │
│ QQuery 14    │  317.63 ms │  303.04 ms │     no change │
│ QQuery 15    │  602.24 ms │  572.18 ms │     no change │
│ QQuery 16    │  200.00 ms │  201.68 ms │     no change │
│ QQuery 17    │ 1848.67 ms │ 1790.60 ms │     no change │
│ QQuery 18    │ 2130.63 ms │ 2179.84 ms │     no change │
│ QQuery 19    │  501.81 ms │  529.85 ms │  1.06x slower │
│ QQuery 20    │  637.91 ms │  661.72 ms │     no change │
│ QQuery 21    │ 1882.43 ms │ 1917.10 ms │     no change │
│ QQuery 22    │  130.68 ms │  141.76 ms │  1.08x slower │
└──────────────┴────────────┴────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (base)         │ 18618.10ms │
│ Total Time (perfect_hj)   │ 18511.93ms │
│ Average Time (base)       │   846.28ms │
│ Average Time (perfect_hj) │   841.45ms │
│ Queries Faster            │          4 │
│ Queries Slower            │          2 │
│ Queries with No Change    │         16 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘

What changes are included in this PR?

  • During the collect_left_input (build) phase, we now conditionally use an ArrayMap instead of a standard JoinHashMapType. This optimization is triggered only when the following conditions are met:
    • There is exactly one join key.
    • The join key can be any integer type convertible to u64 (excluding i128 and u128).
    • The key distribution is sufficiently dense or the key range (max - min) is small enough to justify an array-based allocation.
    • build_side.num_rows() < u32::MAX
  • The ArrayMap works by storing the minimum key as an offset and using a Vec to directly map a key k to its build-side index via data[k- offset].
  • Rewrite Hash Join micro-benchmarks in benchmarks/src/hj.rs to evaluate ArrayMap and
    HashMap performance across varying key densities and probe hit rates

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, this PR introduces two new session configuration parameters to control the behavior of the Perfect Hash Join optimization:

  • perfect_hash_join_small_build_threshold: This parameter defines the maximum key range (max_key - min_key) for the build side to be considered "small." If the key range is below this threshold, the array-based join will be triggered regardless of key density.
  • perfect_hash_join_min_key_density: This parameter sets the minimum density (row_count / key_range) required to enable the perfect hash join optimization for larger key ranges

@github-actions github-actions bot added common Related to common crate physical-plan Changes to the physical-plan crate labels Dec 19, 2025
) -> Result<Self> {
// Initialize with 0 (sentinel for not found)
let mut data: Vec<u32> = vec![0; range];
let mut next: Option<Vec<u32>> = None;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut next: Option<Vec<u32>> = None;
let mut next: Vec<u32> = vec![];

I think this should work as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, it's cleaner this way

@Dandandan
Copy link
Contributor

run benchmarks

@alamb-ghbot
Copy link

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing perfect_hj (a442460) to 8550010 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@Dandandan
Copy link
Contributor

run benchmark tpcds

///
/// TODO: Currently only supports cases where left_side.num_rows() < u32::MAX.
/// Support for left_side.num_rows() >= u32::MAX will be added in the future.
pub perfect_hash_join_min_key_density: f64, default = 0.99
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very high? For a hashmap I believe it's ~75% default (plus it has some more overhead per key), so I think a 75% probably could still be better overall?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point.
I'll add some benchmarks to compare the performance at different densities, including 75%, to find the optimal value for our use case. I'll update this based on the results.
Thanks for the suggestion!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

This comment was marked as outdated.

Copy link
Contributor Author

@UBarney UBarney Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can set it to 20%.

  • In terms of memory usage (run with this code: ), ArrayMap consumes less memory than JoinHashMap at a 20% density, even with duplicate keys.
  • Based on the hj.rs benchmark(results in the PR description), ArrayMap is also faster than JoinHashMap at 20% density, regardless of whether there are duplicate keys.
Memory Comparison Matrix (num_rows = 1000000)
| Density | Dup Rate | ArrayMap (MB) | JoinHashMap (MB) | Ratio (AM/JHM) |
|---------|----------|---------------|------------------|----------------|
|    100% |       0% |          3.81 |            37.81 |           0.10x |
|    100% |      25% |          7.63 |            37.81 |           0.20x |
|    100% |      50% |          7.63 |            37.81 |           0.20x |
|    100% |      75% |          7.63 |            37.81 |           0.20x |
|     75% |       0% |          5.09 |            37.81 |           0.13x |
|     75% |      25% |          8.90 |            37.81 |           0.24x |
|     75% |      50% |          8.90 |            37.81 |           0.24x |
|     75% |      75% |          8.90 |            37.81 |           0.24x |
|     50% |       0% |          7.63 |            37.81 |           0.20x |
|     50% |      25% |         11.44 |            37.81 |           0.30x |
|     50% |      50% |         11.44 |            37.81 |           0.30x |
|     50% |      75% |         11.44 |            37.81 |           0.30x |
|     20% |       0% |         19.07 |            37.81 |           0.50x |
|     20% |      25% |         22.89 |            37.81 |           0.61x |
|     20% |      50% |         22.89 |            37.81 |           0.61x |
|     20% |      75% |         22.89 |            37.81 |           0.61x |
|     10% |       0% |         38.15 |            37.81 |           1.01x |
|     10% |      25% |         41.96 |            37.81 |           1.11x |
|     10% |      50% |         41.96 |            37.81 |           1.11x |
|     10% |      75% |         41.96 |            37.81 |           1.11x |

@alamb-ghbot
Copy link

🤖: Benchmark completed

Details

Comparing HEAD and perfect_hj
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  2728.43 ms │  2746.77 ms │ no change │
│ QQuery 1     │  1259.76 ms │  1286.32 ms │ no change │
│ QQuery 2     │  2442.97 ms │  2404.14 ms │ no change │
│ QQuery 3     │  1146.01 ms │  1175.40 ms │ no change │
│ QQuery 4     │  2324.58 ms │  2297.95 ms │ no change │
│ QQuery 5     │ 28847.71 ms │ 28651.86 ms │ no change │
│ QQuery 6     │  3925.44 ms │  4071.92 ms │ no change │
│ QQuery 7     │  3480.92 ms │  3513.25 ms │ no change │
└──────────────┴─────────────┴─────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 46155.84ms │
│ Total Time (perfect_hj)   │ 46147.61ms │
│ Average Time (HEAD)       │  5769.48ms │
│ Average Time (perfect_hj) │  5768.45ms │
│ Queries Faster            │          0 │
│ Queries Slower            │          0 │
│ Queries with No Change    │          8 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.55 ms │     2.28 ms │ +1.12x faster │
│ QQuery 1     │    49.94 ms │    49.72 ms │     no change │
│ QQuery 2     │   136.75 ms │   135.77 ms │     no change │
│ QQuery 3     │   157.55 ms │   156.38 ms │     no change │
│ QQuery 4     │  1127.59 ms │  1099.95 ms │     no change │
│ QQuery 5     │  1513.28 ms │  1540.48 ms │     no change │
│ QQuery 6     │     2.09 ms │     2.18 ms │     no change │
│ QQuery 7     │    54.92 ms │    54.54 ms │     no change │
│ QQuery 8     │  1435.36 ms │  1429.02 ms │     no change │
│ QQuery 9     │  1826.50 ms │  1818.96 ms │     no change │
│ QQuery 10    │   362.05 ms │   359.18 ms │     no change │
│ QQuery 11    │   413.64 ms │   415.45 ms │     no change │
│ QQuery 12    │  1346.02 ms │  1355.55 ms │     no change │
│ QQuery 13    │  2003.64 ms │  2030.93 ms │     no change │
│ QQuery 14    │  1258.62 ms │  1264.24 ms │     no change │
│ QQuery 15    │  1236.44 ms │  1220.63 ms │     no change │
│ QQuery 16    │  2660.06 ms │  2679.40 ms │     no change │
│ QQuery 17    │  2646.48 ms │  2669.57 ms │     no change │
│ QQuery 18    │  5032.89 ms │  5243.53 ms │     no change │
│ QQuery 19    │   119.08 ms │   120.08 ms │     no change │
│ QQuery 20    │  1909.83 ms │  1955.97 ms │     no change │
│ QQuery 21    │  2219.56 ms │  2228.45 ms │     no change │
│ QQuery 22    │  3754.10 ms │  3742.17 ms │     no change │
│ QQuery 23    │ 12284.83 ms │ 12304.70 ms │     no change │
│ QQuery 24    │   200.89 ms │   216.80 ms │  1.08x slower │
│ QQuery 25    │   459.79 ms │   477.52 ms │     no change │
│ QQuery 26    │   217.46 ms │   226.68 ms │     no change │
│ QQuery 27    │  2733.98 ms │  2714.81 ms │     no change │
│ QQuery 28    │ 24706.02 ms │ 23446.26 ms │ +1.05x faster │
│ QQuery 29    │   983.86 ms │   951.63 ms │     no change │
│ QQuery 30    │  1325.66 ms │  1334.96 ms │     no change │
│ QQuery 31    │  1361.59 ms │  1319.42 ms │     no change │
│ QQuery 32    │  4654.91 ms │  4894.57 ms │  1.05x slower │
│ QQuery 33    │  5895.01 ms │  5580.74 ms │ +1.06x faster │
│ QQuery 34    │  5921.45 ms │  5968.72 ms │     no change │
│ QQuery 35    │  1944.76 ms │  1894.39 ms │     no change │
│ QQuery 36    │    66.49 ms │    65.88 ms │     no change │
│ QQuery 37    │    45.71 ms │    47.05 ms │     no change │
│ QQuery 38    │    66.55 ms │    67.85 ms │     no change │
│ QQuery 39    │   105.16 ms │   104.80 ms │     no change │
│ QQuery 40    │    28.03 ms │    27.95 ms │     no change │
│ QQuery 41    │    23.88 ms │    24.08 ms │     no change │
│ QQuery 42    │    19.99 ms │    20.73 ms │     no change │
└──────────────┴─────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 94314.96ms │
│ Total Time (perfect_hj)   │ 93263.97ms │
│ Average Time (HEAD)       │  2193.37ms │
│ Average Time (perfect_hj) │  2168.93ms │
│ Queries Faster            │          3 │
│ Queries Slower            │          2 │
│ Queries with No Change    │         38 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 139.34 ms │  130.00 ms │ +1.07x faster │
│ QQuery 2     │  27.17 ms │   23.45 ms │ +1.16x faster │
│ QQuery 3     │  40.09 ms │   38.79 ms │     no change │
│ QQuery 4     │  29.24 ms │   28.73 ms │     no change │
│ QQuery 5     │  89.41 ms │   88.45 ms │     no change │
│ QQuery 6     │  20.11 ms │   19.98 ms │     no change │
│ QQuery 7     │ 233.64 ms │  193.84 ms │ +1.21x faster │
│ QQuery 8     │  40.29 ms │   38.67 ms │     no change │
│ QQuery 9     │ 108.11 ms │  102.06 ms │ +1.06x faster │
│ QQuery 10    │  64.69 ms │   64.74 ms │     no change │
│ QQuery 11    │  18.64 ms │   11.46 ms │ +1.63x faster │
│ QQuery 12    │  51.76 ms │   49.96 ms │     no change │
│ QQuery 13    │  47.87 ms │   48.78 ms │     no change │
│ QQuery 14    │  14.03 ms │   14.30 ms │     no change │
│ QQuery 15    │  25.20 ms │   25.20 ms │     no change │
│ QQuery 16    │  25.60 ms │   24.53 ms │     no change │
│ QQuery 17    │ 156.37 ms │  151.59 ms │     no change │
│ QQuery 18    │ 283.58 ms │  283.26 ms │     no change │
│ QQuery 19    │  35.96 ms │   38.98 ms │  1.08x slower │
│ QQuery 20    │  48.62 ms │   50.36 ms │     no change │
│ QQuery 21    │ 312.85 ms │  309.67 ms │     no change │
│ QQuery 22    │  17.67 ms │   18.01 ms │     no change │
└──────────────┴───────────┴────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary         ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 1830.21ms │
│ Total Time (perfect_hj)   │ 1754.80ms │
│ Average Time (HEAD)       │   83.19ms │
│ Average Time (perfect_hj) │   79.76ms │
│ Queries Faster            │         5 │
│ Queries Slower            │         1 │
│ Queries with No Change    │        16 │
│ Queries with Failure      │         0 │
└───────────────────────────┴───────────┘

@alamb-ghbot
Copy link

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing perfect_hj (a442460) to 8550010 diff using: tpcds
Results will be posted here when complete

@alamb-ghbot
Copy link

🤖: Benchmark completed

Details

Comparing HEAD and perfect_hj
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃  perfect_hj ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │    64.32 ms │    64.03 ms │     no change │
│ QQuery 2     │   202.76 ms │   209.93 ms │     no change │
│ QQuery 3     │   160.17 ms │   160.30 ms │     no change │
│ QQuery 4     │  2076.33 ms │  2016.14 ms │     no change │
│ QQuery 5     │   271.72 ms │   267.86 ms │     no change │
│ QQuery 6     │  1539.80 ms │  1528.76 ms │     no change │
│ QQuery 7     │   501.34 ms │   500.09 ms │     no change │
│ QQuery 8     │   172.26 ms │   172.88 ms │     no change │
│ QQuery 9     │   283.58 ms │   274.82 ms │     no change │
│ QQuery 10    │   169.65 ms │   172.34 ms │     no change │
│ QQuery 11    │  1414.43 ms │  1436.15 ms │     no change │
│ QQuery 12    │    74.91 ms │    73.81 ms │     no change │
│ QQuery 13    │   554.37 ms │   557.49 ms │     no change │
│ QQuery 14    │  2016.91 ms │  1803.46 ms │ +1.12x faster │
│ QQuery 15    │    28.50 ms │    27.30 ms │     no change │
│ QQuery 16    │    56.60 ms │    57.60 ms │     no change │
│ QQuery 17    │   354.64 ms │   357.88 ms │     no change │
│ QQuery 18    │   191.69 ms │   190.50 ms │     no change │
│ QQuery 19    │   228.28 ms │   229.67 ms │     no change │
│ QQuery 20    │    23.36 ms │    23.50 ms │     no change │
│ QQuery 21    │    33.71 ms │    32.15 ms │     no change │
│ QQuery 22    │   985.29 ms │   906.05 ms │ +1.09x faster │
│ QQuery 23    │  1845.20 ms │  1713.30 ms │ +1.08x faster │
│ QQuery 24    │   642.89 ms │   636.64 ms │     no change │
│ QQuery 25    │   523.12 ms │   516.07 ms │     no change │
│ QQuery 26    │   123.94 ms │   126.15 ms │     no change │
│ QQuery 27    │   504.93 ms │   500.48 ms │     no change │
│ QQuery 28    │   303.81 ms │   302.04 ms │     no change │
│ QQuery 29    │   444.98 ms │   447.70 ms │     no change │
│ QQuery 30    │    65.01 ms │    63.94 ms │     no change │
│ QQuery 31    │   306.15 ms │   306.66 ms │     no change │
│ QQuery 32    │    77.86 ms │    78.00 ms │     no change │
│ QQuery 33    │   193.25 ms │   194.57 ms │     no change │
│ QQuery 34    │   159.04 ms │   162.10 ms │     no change │
│ QQuery 35    │   169.34 ms │   171.45 ms │     no change │
│ QQuery 36    │   296.09 ms │   286.26 ms │     no change │
│ QQuery 37    │   257.06 ms │   254.49 ms │     no change │
│ QQuery 38    │   155.26 ms │   151.49 ms │     no change │
│ QQuery 39    │   216.46 ms │   194.61 ms │ +1.11x faster │
│ QQuery 40    │   197.40 ms │   172.47 ms │ +1.14x faster │
│ QQuery 41    │    16.30 ms │    16.66 ms │     no change │
│ QQuery 42    │   142.70 ms │   145.52 ms │     no change │
│ QQuery 43    │   126.26 ms │   129.13 ms │     no change │
│ QQuery 44    │    15.68 ms │    14.96 ms │     no change │
│ QQuery 45    │    83.14 ms │    80.86 ms │     no change │
│ QQuery 46    │   329.53 ms │   329.04 ms │     no change │
│ QQuery 47    │  1279.95 ms │  1186.17 ms │ +1.08x faster │
│ QQuery 48    │   416.35 ms │   420.97 ms │     no change │
│ QQuery 49    │   355.14 ms │   354.37 ms │     no change │
│ QQuery 50    │   338.48 ms │   343.17 ms │     no change │
│ QQuery 51    │   298.83 ms │   294.59 ms │     no change │
│ QQuery 52    │   147.14 ms │   143.27 ms │     no change │
│ QQuery 53    │   149.53 ms │   146.40 ms │     no change │
│ QQuery 54    │   207.43 ms │   216.19 ms │     no change │
│ QQuery 55    │   144.21 ms │   144.60 ms │     no change │
│ QQuery 56    │   193.16 ms │   190.45 ms │     no change │
│ QQuery 57    │   319.98 ms │   302.28 ms │ +1.06x faster │
│ QQuery 58    │   515.69 ms │   482.75 ms │ +1.07x faster │
│ QQuery 59    │   285.93 ms │   288.57 ms │     no change │
│ QQuery 60    │   199.98 ms │   199.54 ms │     no change │
│ QQuery 61    │   240.08 ms │   235.47 ms │     no change │
│ QQuery 62    │  1378.44 ms │  1398.12 ms │     no change │
│ QQuery 63    │   149.43 ms │   148.88 ms │     no change │
│ QQuery 64    │  1142.94 ms │  1161.26 ms │     no change │
│ QQuery 65    │   368.38 ms │   363.41 ms │     no change │
│ QQuery 66    │   389.40 ms │   393.14 ms │     no change │
│ QQuery 67    │   646.76 ms │   627.61 ms │     no change │
│ QQuery 68    │   383.87 ms │   384.78 ms │     no change │
│ QQuery 69    │   167.41 ms │   170.00 ms │     no change │
│ QQuery 70    │   526.93 ms │   528.35 ms │     no change │
│ QQuery 71    │   183.79 ms │   184.70 ms │     no change │
│ QQuery 72    │  2474.00 ms │  2083.02 ms │ +1.19x faster │
│ QQuery 73    │   156.03 ms │   157.55 ms │     no change │
│ QQuery 74    │   893.90 ms │   899.19 ms │     no change │
│ QQuery 75    │   394.28 ms │   389.15 ms │     no change │
│ QQuery 76    │   184.12 ms │   182.03 ms │     no change │
│ QQuery 77    │   265.99 ms │   262.24 ms │     no change │
│ QQuery 78    │   939.94 ms │   953.61 ms │     no change │
│ QQuery 79    │   341.30 ms │   340.35 ms │     no change │
│ QQuery 80    │   496.20 ms │   493.37 ms │     no change │
│ QQuery 81    │    43.41 ms │    40.77 ms │ +1.06x faster │
│ QQuery 82    │   296.71 ms │   283.45 ms │     no change │
│ QQuery 83    │    67.95 ms │    62.11 ms │ +1.09x faster │
│ QQuery 84    │    64.72 ms │    65.66 ms │     no change │
│ QQuery 85    │   214.94 ms │   226.66 ms │  1.05x slower │
│ QQuery 86    │    60.18 ms │    57.66 ms │     no change │
│ QQuery 87    │   151.20 ms │   157.81 ms │     no change │
│ QQuery 88    │   243.17 ms │   250.08 ms │     no change │
│ QQuery 89    │   174.02 ms │   172.62 ms │     no change │
│ QQuery 90    │    36.80 ms │    37.36 ms │     no change │
│ QQuery 91    │    95.95 ms │    92.97 ms │     no change │
│ QQuery 92    │    78.25 ms │    77.30 ms │     no change │
│ QQuery 93    │   270.50 ms │   263.82 ms │     no change │
│ QQuery 94    │    85.61 ms │    84.43 ms │     no change │
│ QQuery 95    │   260.54 ms │   228.76 ms │ +1.14x faster │
│ QQuery 96    │   111.78 ms │   110.86 ms │     no change │
│ QQuery 97    │   186.45 ms │   187.03 ms │     no change │
│ QQuery 98    │   235.54 ms │   237.35 ms │     no change │
│ QQuery 99    │ 14895.30 ms │ 15020.29 ms │     no change │
└──────────────┴─────────────┴─────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary         ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)         │ 52748.01ms │
│ Total Time (perfect_hj)   │ 51783.83ms │
│ Average Time (HEAD)       │   532.81ms │
│ Average Time (perfect_hj) │   523.07ms │
│ Queries Faster            │         12 │
│ Queries Slower            │          1 │
│ Queries with No Change    │         86 │
│ Queries with Failure      │          0 │
└───────────────────────────┴────────────┘

comparison = BenchmarkRun.load_from_file(comparison_path)

console = Console()
console = Console(width=200)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've increased the console width to 200. I added more information like 'density' to the queryName, which made it longer and caused it to be cut off in the output before

baseline_header = baseline_path.parent.stem
comparison_header = comparison_path.parent.stem
baseline_header = baseline_path.parent.name
comparison_header = comparison_path.parent.name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before, a path like .../density=0.1/... was incorrectly shortened to density=0. Now, by using .parent.name, we correctly get the full directory name, density=0.1

@github-actions github-actions bot added proto Related to proto crate documentation Improvements or additions to documentation sqllogictest SQL Logic Tests (.slt) labels Dec 23, 2025
@UBarney UBarney force-pushed the perfect_hj branch 4 times, most recently from 8044bce to 954f1df Compare December 28, 2025 08:02
@UBarney UBarney marked this pull request as ready for review December 28, 2025 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to common crate documentation Improvements or additions to documentation physical-plan Changes to the physical-plan crate proto Related to proto crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[EPIC]: Perfect Hash Join

3 participants