@Dandandan
Contributor

@Dandandan Dandandan commented Jan 3, 2026

Which issue does this PR close?

Rationale for this change

Speed up the accumulator code (sum, avg, count) by specializing for the non-null case.

What changes are included in this PR?

  • Specialize NullState for non-null values.
  • Use unchecked indexing.
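
As a rough illustration of the two changes listed above, here is a minimal sketch of a per-group sum that takes a separate path when the input has no nulls and uses unchecked indexing on that path. The function name `accumulate_sum`, the `Option<&[bool]>` validity mask, and the up-front index validation are all assumptions made for this example; they are not the code in this PR, which operates on Arrow arrays.

```rust
/// Hypothetical stand-in for a grouped sum accumulator (not the PR's code).
/// `validity` is `None` when the input batch contains no nulls.
fn accumulate_sum(
    sums: &mut [i64],
    group_indices: &[usize],
    values: &[i64],
    validity: Option<&[bool]>,
) {
    assert_eq!(group_indices.len(), values.len());
    match validity {
        // Non-null fast path: no per-row validity check in the loop.
        None => {
            // Validate the indices once so the unchecked access below is sound.
            // A real implementation would more likely rely on an existing
            // invariant (every group index < number of groups) instead.
            let n = sums.len();
            assert!(group_indices.iter().all(|&g| g < n));
            for (&g, &v) in group_indices.iter().zip(values) {
                // SAFETY: every `g` was checked against `sums.len()` above.
                unsafe {
                    *sums.get_unchecked_mut(g) += v;
                }
            }
        }
        // General path: skip null rows, keep ordinary bounds-checked indexing.
        Some(valid) => {
            assert_eq!(valid.len(), values.len());
            for ((&g, &v), &is_valid) in group_indices.iter().zip(values).zip(valid) {
                if is_valid {
                    sums[g] += v;
                }
            }
        }
    }
}
```

Branching once per batch rather than once per row is what keeps the hot loop free of null checks in this sketch.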

Are these changes tested?

Are there any user-facing changes?

@Dandandan
Contributor Author

run benchmark tpch

@github-actions github-actions bot added the "functions" (Changes to functions implementation) label Jan 3, 2026
@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (2e70075) to 70daf88 diff using: tpch
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 195.98 ms │           177.99 ms │ +1.10x faster │
│ QQuery 2     │  93.00 ms │            93.10 ms │     no change │
│ QQuery 3     │ 125.87 ms │           129.76 ms │     no change │
│ QQuery 4     │  76.91 ms │            77.23 ms │     no change │
│ QQuery 5     │ 173.24 ms │           172.56 ms │     no change │
│ QQuery 6     │  66.55 ms │            60.69 ms │ +1.10x faster │
│ QQuery 7     │ 213.23 ms │           212.37 ms │     no change │
│ QQuery 8     │ 163.32 ms │           159.22 ms │     no change │
│ QQuery 9     │ 222.59 ms │           225.17 ms │     no change │
│ QQuery 10    │ 183.45 ms │           186.90 ms │     no change │
│ QQuery 11    │  73.45 ms │            73.71 ms │     no change │
│ QQuery 12    │ 119.27 ms │           119.24 ms │     no change │
│ QQuery 13    │ 217.80 ms │           211.53 ms │     no change │
│ QQuery 14    │  88.21 ms │            92.57 ms │     no change │
│ QQuery 15    │ 121.09 ms │           118.33 ms │     no change │
│ QQuery 16    │  55.94 ms │            56.40 ms │     no change │
│ QQuery 17    │ 271.09 ms │           263.09 ms │     no change │
│ QQuery 18    │ 323.32 ms │           307.03 ms │ +1.05x faster │
│ QQuery 19    │ 133.86 ms │           131.78 ms │     no change │
│ QQuery 20    │ 125.14 ms │           125.98 ms │     no change │
│ QQuery 21    │ 258.77 ms │           264.91 ms │     no change │
│ QQuery 22    │  41.10 ms │            43.76 ms │  1.06x slower │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 3343.17ms │
│ Total Time (speedup_accumulate2)   │ 3303.33ms │
│ Average Time (HEAD)                │  151.96ms │
│ Average Time (speedup_accumulate2) │  150.15ms │
│ Queries Faster                     │         3 │
│ Queries Slower                     │         1 │
│ Queries with No Change             │        18 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (2e70075) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2396.32 ms │          2401.55 ms │     no change │
│ QQuery 1     │   968.67 ms │           948.40 ms │     no change │
│ QQuery 2     │  1904.98 ms │          1904.49 ms │     no change │
│ QQuery 3     │  1189.95 ms │          1082.74 ms │ +1.10x faster │
│ QQuery 4     │  2302.46 ms │          2257.02 ms │     no change │
│ QQuery 5     │ 28185.03 ms │         28318.68 ms │     no change │
│ QQuery 6     │  3989.47 ms │          3954.18 ms │     no change │
│ QQuery 7     │  3607.90 ms │          3402.52 ms │ +1.06x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 44544.78ms │
│ Total Time (speedup_accumulate2)   │ 44269.58ms │
│ Average Time (HEAD)                │  5568.10ms │
│ Average Time (speedup_accumulate2) │  5533.70ms │
│ Queries Faster                     │          2 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          6 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.44 ms │             1.47 ms │     no change │
│ QQuery 1     │    49.61 ms │            51.67 ms │     no change │
│ QQuery 2     │   133.37 ms │           135.39 ms │     no change │
│ QQuery 3     │   153.38 ms │           155.18 ms │     no change │
│ QQuery 4     │  1080.14 ms │          1100.08 ms │     no change │
│ QQuery 5     │  1374.56 ms │          1365.30 ms │     no change │
│ QQuery 6     │     1.44 ms │             1.45 ms │     no change │
│ QQuery 7     │    54.79 ms │            55.54 ms │     no change │
│ QQuery 8     │  1431.19 ms │          1436.26 ms │     no change │
│ QQuery 9     │  1834.80 ms │          1760.94 ms │     no change │
│ QQuery 10    │   343.66 ms │           346.76 ms │     no change │
│ QQuery 11    │   392.50 ms │           406.43 ms │     no change │
│ QQuery 12    │  1250.57 ms │          1282.74 ms │     no change │
│ QQuery 13    │  1948.36 ms │          1901.69 ms │     no change │
│ QQuery 14    │  1245.12 ms │          1252.87 ms │     no change │
│ QQuery 15    │  1253.71 ms │          1240.80 ms │     no change │
│ QQuery 16    │  2604.43 ms │          2638.39 ms │     no change │
│ QQuery 17    │  2588.56 ms │          2543.25 ms │     no change │
│ QQuery 18    │  5552.06 ms │          4873.72 ms │ +1.14x faster │
│ QQuery 19    │   121.28 ms │           118.29 ms │     no change │
│ QQuery 20    │  1924.64 ms │          1852.19 ms │     no change │
│ QQuery 21    │  2220.98 ms │          2110.61 ms │     no change │
│ QQuery 22    │  3838.07 ms │          3633.27 ms │ +1.06x faster │
│ QQuery 23    │ 17788.42 ms │         12106.29 ms │ +1.47x faster │
│ QQuery 24    │   224.44 ms │           211.69 ms │ +1.06x faster │
│ QQuery 25    │   475.26 ms │           443.87 ms │ +1.07x faster │
│ QQuery 26    │   202.16 ms │           209.78 ms │     no change │
│ QQuery 27    │  2850.97 ms │          2618.79 ms │ +1.09x faster │
│ QQuery 28    │ 23800.99 ms │         24347.77 ms │     no change │
│ QQuery 29    │   942.77 ms │          1011.45 ms │  1.07x slower │
│ QQuery 30    │  1307.75 ms │          1248.78 ms │     no change │
│ QQuery 31    │  1366.48 ms │          1289.24 ms │ +1.06x faster │
│ QQuery 32    │  5105.69 ms │          4974.34 ms │     no change │
│ QQuery 33    │  5738.49 ms │          5365.07 ms │ +1.07x faster │
│ QQuery 34    │  5800.86 ms │          5667.67 ms │     no change │
│ QQuery 35    │  1986.02 ms │          1876.39 ms │ +1.06x faster │
│ QQuery 36    │    65.03 ms │            63.81 ms │     no change │
│ QQuery 37    │    44.32 ms │            43.01 ms │     no change │
│ QQuery 38    │    65.12 ms │            64.24 ms │     no change │
│ QQuery 39    │   100.76 ms │           101.10 ms │     no change │
│ QQuery 40    │    24.99 ms │            25.83 ms │     no change │
│ QQuery 41    │    22.47 ms │            22.20 ms │     no change │
│ QQuery 42    │    19.13 ms │            20.06 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 99330.77ms │
│ Total Time (speedup_accumulate2)   │ 91975.66ms │
│ Average Time (HEAD)                │  2310.02ms │
│ Average Time (speedup_accumulate2) │  2138.97ms │
│ Queries Faster                     │          9 │
│ Queries Slower                     │          1 │
│ Queries with No Change             │         33 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 118.98 ms │           102.93 ms │ +1.16x faster │
│ QQuery 2     │  27.10 ms │            27.74 ms │     no change │
│ QQuery 3     │  38.52 ms │            37.98 ms │     no change │
│ QQuery 4     │  28.23 ms │            29.33 ms │     no change │
│ QQuery 5     │  86.26 ms │            87.18 ms │     no change │
│ QQuery 6     │  20.01 ms │            19.86 ms │     no change │
│ QQuery 7     │ 229.88 ms │           223.06 ms │     no change │
│ QQuery 8     │  34.32 ms │            36.07 ms │  1.05x slower │
│ QQuery 9     │ 104.16 ms │            99.28 ms │     no change │
│ QQuery 10    │  62.37 ms │            64.49 ms │     no change │
│ QQuery 11    │  16.19 ms │            17.94 ms │  1.11x slower │
│ QQuery 12    │  49.55 ms │            49.97 ms │     no change │
│ QQuery 13    │  47.28 ms │            48.64 ms │     no change │
│ QQuery 14    │  13.42 ms │            13.30 ms │     no change │
│ QQuery 15    │  23.99 ms │            24.16 ms │     no change │
│ QQuery 16    │  24.17 ms │            24.46 ms │     no change │
│ QQuery 17    │ 148.07 ms │           143.70 ms │     no change │
│ QQuery 18    │ 278.87 ms │           275.03 ms │     no change │
│ QQuery 19    │  39.61 ms │            36.85 ms │ +1.07x faster │
│ QQuery 20    │  48.85 ms │            50.28 ms │     no change │
│ QQuery 21    │ 317.83 ms │           316.43 ms │     no change │
│ QQuery 22    │  17.22 ms │            17.77 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1774.88ms │
│ Total Time (speedup_accumulate2)   │ 1746.44ms │
│ Average Time (HEAD)                │   80.68ms │
│ Average Time (speedup_accumulate2) │   79.38ms │
│ Queries Faster                     │         2 │
│ Queries Slower                     │         2 │
│ Queries with No Change             │        18 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan Dandandan marked this pull request as ready for review January 3, 2026 16:50
@Dandandan
Contributor Author

│ QQuery 1 │ 118.98 ms │ 102.93 ms │ +1.16x faster │

Looks like it is a nice win.
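
For readers checking the "Change" column, the sketch below reproduces the labels from the two timing columns. The 5% noise band and the exact formatting are inferred from rows in the tables in this thread, not taken from the benchmark script, so treat `classify_change` and its threshold as assumptions.

```rust
/// Hypothetical reconstruction of the "Change" column; the 5% band is an
/// assumption inferred from the tables, not read from the benchmark tooling.
fn classify_change(baseline_ms: f64, candidate_ms: f64) -> String {
    const NOISE: f64 = 0.05;
    // Ratio below 1.0 means the candidate branch ran faster than the baseline.
    let ratio = candidate_ms / baseline_ms;
    if (1.0 - NOISE..=1.0 + NOISE).contains(&ratio) {
        "no change".to_string()
    } else if ratio < 1.0 {
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        format!("{:.2}x slower", ratio)
    }
}
```

Under this assumption, 118.98 ms against 102.93 ms yields "+1.16x faster", while 2220.98 ms against 2110.61 ms stays inside the band and is reported as "no change".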

@Dandandan Dandandan requested a review from alamb January 3, 2026 17:23
@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (2e70075) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@Dandandan Dandandan changed the title from "Speedup accumulators" to "Optimize Nullstate / accumulators" Jan 3, 2026
@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2364.93 ms │          2305.51 ms │     no change │
│ QQuery 1     │   910.69 ms │           924.96 ms │     no change │
│ QQuery 2     │  1939.92 ms │          1863.22 ms │     no change │
│ QQuery 3     │  1215.90 ms │          1099.86 ms │ +1.11x faster │
│ QQuery 4     │  2275.35 ms │          2224.26 ms │     no change │
│ QQuery 5     │ 28223.77 ms │         28200.15 ms │     no change │
│ QQuery 6     │  4000.90 ms │          3956.86 ms │     no change │
│ QQuery 7     │  3448.92 ms │          3338.27 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 44380.39ms │
│ Total Time (speedup_accumulate2)   │ 43913.10ms │
│ Average Time (HEAD)                │  5547.55ms │
│ Average Time (speedup_accumulate2) │  5489.14ms │
│ Queries Faster                     │          1 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          7 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.44 ms │             1.45 ms │     no change │
│ QQuery 1     │    49.84 ms │            49.41 ms │     no change │
│ QQuery 2     │   131.95 ms │           135.29 ms │     no change │
│ QQuery 3     │   156.93 ms │           148.66 ms │ +1.06x faster │
│ QQuery 4     │  1067.52 ms │          1035.32 ms │     no change │
│ QQuery 5     │  1345.19 ms │          1345.68 ms │     no change │
│ QQuery 6     │     1.44 ms │             1.42 ms │     no change │
│ QQuery 7     │    53.58 ms │            55.38 ms │     no change │
│ QQuery 8     │  1466.22 ms │          1411.60 ms │     no change │
│ QQuery 9     │  1851.63 ms │          1798.48 ms │     no change │
│ QQuery 10    │   342.01 ms │           349.47 ms │     no change │
│ QQuery 11    │   395.10 ms │           408.47 ms │     no change │
│ QQuery 12    │  1276.70 ms │          1248.48 ms │     no change │
│ QQuery 13    │  1967.30 ms │          1965.99 ms │     no change │
│ QQuery 14    │  1263.45 ms │          1230.43 ms │     no change │
│ QQuery 15    │  1257.20 ms │          1211.84 ms │     no change │
│ QQuery 16    │  2604.95 ms │          2512.42 ms │     no change │
│ QQuery 17    │  2535.78 ms │          2516.91 ms │     no change │
│ QQuery 18    │  5090.06 ms │          4786.19 ms │ +1.06x faster │
│ QQuery 19    │   116.90 ms │           119.93 ms │     no change │
│ QQuery 20    │  1880.12 ms │          1822.01 ms │     no change │
│ QQuery 21    │  2195.23 ms │          2119.26 ms │     no change │
│ QQuery 22    │  3812.42 ms │          3667.21 ms │     no change │
│ QQuery 23    │ 14735.79 ms │         12032.40 ms │ +1.22x faster │
│ QQuery 24    │   215.78 ms │           203.87 ms │ +1.06x faster │
│ QQuery 25    │   469.34 ms │           458.36 ms │     no change │
│ QQuery 26    │   225.23 ms │           210.02 ms │ +1.07x faster │
│ QQuery 27    │  2773.05 ms │          2650.96 ms │     no change │
│ QQuery 28    │ 23602.84 ms │         24485.25 ms │     no change │
│ QQuery 29    │   943.91 ms │           975.03 ms │     no change │
│ QQuery 30    │  1352.11 ms │          1228.35 ms │ +1.10x faster │
│ QQuery 31    │  1384.11 ms │          1275.69 ms │ +1.08x faster │
│ QQuery 32    │  4762.69 ms │          4470.40 ms │ +1.07x faster │
│ QQuery 33    │  5680.36 ms │          5159.97 ms │ +1.10x faster │
│ QQuery 34    │  5672.22 ms │          5672.36 ms │     no change │
│ QQuery 35    │  1944.93 ms │          1911.46 ms │     no change │
│ QQuery 36    │    66.09 ms │            67.65 ms │     no change │
│ QQuery 37    │    44.41 ms │            42.84 ms │     no change │
│ QQuery 38    │    66.74 ms │            65.53 ms │     no change │
│ QQuery 39    │    99.97 ms │           104.32 ms │     no change │
│ QQuery 40    │    25.64 ms │            26.62 ms │     no change │
│ QQuery 41    │    22.72 ms │            23.88 ms │  1.05x slower │
│ QQuery 42    │    18.73 ms │            19.52 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 94969.61ms │
│ Total Time (speedup_accumulate2)   │ 91025.77ms │
│ Average Time (HEAD)                │  2208.60ms │
│ Average Time (speedup_accumulate2) │  2116.88ms │
│ Queries Faster                     │          9 │
│ Queries Slower                     │          1 │
│ Queries with No Change             │         33 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 119.02 ms │           104.49 ms │ +1.14x faster │
│ QQuery 2     │  29.62 ms │            27.01 ms │ +1.10x faster │
│ QQuery 3     │  36.09 ms │            32.81 ms │ +1.10x faster │
│ QQuery 4     │  29.36 ms │            29.39 ms │     no change │
│ QQuery 5     │  87.85 ms │            86.50 ms │     no change │
│ QQuery 6     │  19.97 ms │            19.63 ms │     no change │
│ QQuery 7     │ 219.06 ms │           225.74 ms │     no change │
│ QQuery 8     │  33.16 ms │            33.57 ms │     no change │
│ QQuery 9     │ 102.39 ms │           101.95 ms │     no change │
│ QQuery 10    │  61.38 ms │            61.75 ms │     no change │
│ QQuery 11    │  17.81 ms │            16.50 ms │ +1.08x faster │
│ QQuery 12    │  51.31 ms │            50.94 ms │     no change │
│ QQuery 13    │  46.73 ms │            48.50 ms │     no change │
│ QQuery 14    │  13.41 ms │            13.24 ms │     no change │
│ QQuery 15    │  24.09 ms │            24.25 ms │     no change │
│ QQuery 16    │  24.32 ms │            23.57 ms │     no change │
│ QQuery 17    │ 147.16 ms │           141.32 ms │     no change │
│ QQuery 18    │ 273.15 ms │           275.10 ms │     no change │
│ QQuery 19    │  36.85 ms │            37.84 ms │     no change │
│ QQuery 20    │  50.20 ms │            48.36 ms │     no change │
│ QQuery 21    │ 287.40 ms │           300.98 ms │     no change │
│ QQuery 22    │  17.32 ms │            17.12 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1727.63ms │
│ Total Time (speedup_accumulate2)   │ 1720.53ms │
│ Average Time (HEAD)                │   78.53ms │
│ Average Time (speedup_accumulate2) │   78.21ms │
│ Queries Faster                     │         4 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        18 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (fb249d6) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

Benchmark script failed with exit code 101.

Last 10 lines of output:

    |            ^^^
help: consider using `Option::expect` to unwrap the `Option<Option<NullBuffer>>` value, panicking if the value is an `Option::None`
    |
880 |         let sums = PrimitiveArray::<T>::new(sums.into(), nulls.expect("REASON")) // zero copy
    |                                                               +++++++++++++++++

Some errors have detailed explanations: E0308, E0599, E0624.
For more information about an error, try `rustc --explain E0308`.
error: could not compile `datafusion-functions-aggregate` (lib) due to 7 previous errors
warning: build failed, waiting for other jobs to finish...
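
The compiler hint above points at a nested `Option<Option<NullBuffer>>` being passed where a single `Option` is expected. One common way to resolve that shape of mismatch without panicking is `Option::flatten`; the sketch below shows the idea with stand-in types. `NullMask`, `build_nulls`, and `new_array` are hypothetical names for this example, and nothing here is claimed to be the fix that landed in the PR.

```rust
/// Hypothetical stand-in for a null bitmap (not arrow's `NullBuffer`).
#[derive(Debug, PartialEq)]
struct NullMask(Vec<bool>);

/// Stand-in for a constructor that, like the call in the error output,
/// expects a single `Option` for the nulls argument.
fn new_array(values: Vec<i64>, nulls: Option<NullMask>) -> (Vec<i64>, Option<NullMask>) {
    (values, nulls)
}

/// Returns a nested option: the outer `None` means validity was never
/// tracked, the inner `None` means it was tracked and every row is valid.
fn build_nulls(tracked: Option<&[bool]>) -> Option<Option<NullMask>> {
    tracked.map(|valid| {
        if valid.iter().all(|&v| v) {
            None
        } else {
            Some(NullMask(valid.to_vec()))
        }
    })
}

/// Collapses both "no nulls" cases into `None` instead of calling `expect`.
fn finish(values: Vec<i64>, tracked: Option<&[bool]>) -> (Vec<i64>, Option<NullMask>) {
    new_array(values, build_nulls(tracked).flatten())
}
```

Unlike `expect`, `flatten` treats "never tracked" and "tracked, all valid" the same way, which is only appropriate when both mean there is no null buffer to attach.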

@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (bdeda6a) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2373.27 ms │          2395.72 ms │     no change │
│ QQuery 1     │   942.08 ms │           911.66 ms │     no change │
│ QQuery 2     │  1854.86 ms │          1869.03 ms │     no change │
│ QQuery 3     │  1209.39 ms │          1088.92 ms │ +1.11x faster │
│ QQuery 4     │  2263.96 ms │          2234.55 ms │     no change │
│ QQuery 5     │ 28186.85 ms │         28100.66 ms │     no change │
│ QQuery 6     │  4005.89 ms │          3956.68 ms │     no change │
│ QQuery 7     │  3588.24 ms │          2855.47 ms │ +1.26x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 44424.54ms │
│ Total Time (speedup_accumulate2)   │ 43412.69ms │
│ Average Time (HEAD)                │  5553.07ms │
│ Average Time (speedup_accumulate2) │  5426.59ms │
│ Queries Faster                     │          2 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          6 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.41 ms │             1.46 ms │     no change │
│ QQuery 1     │    48.45 ms │            50.70 ms │     no change │
│ QQuery 2     │   134.17 ms │           133.22 ms │     no change │
│ QQuery 3     │   154.16 ms │           153.01 ms │     no change │
│ QQuery 4     │  1054.42 ms │          1114.63 ms │  1.06x slower │
│ QQuery 5     │  1385.25 ms │          1384.91 ms │     no change │
│ QQuery 6     │     1.44 ms │             1.42 ms │     no change │
│ QQuery 7     │    53.88 ms │            55.16 ms │     no change │
│ QQuery 8     │  1429.95 ms │          1522.33 ms │  1.06x slower │
│ QQuery 9     │  1866.99 ms │          1877.97 ms │     no change │
│ QQuery 10    │   345.56 ms │           343.67 ms │     no change │
│ QQuery 11    │   395.16 ms │           391.14 ms │     no change │
│ QQuery 12    │  1269.03 ms │          1310.46 ms │     no change │
│ QQuery 13    │  1948.75 ms │          1965.13 ms │     no change │
│ QQuery 14    │  1237.28 ms │          1297.19 ms │     no change │
│ QQuery 15    │  1237.17 ms │          1297.92 ms │     no change │
│ QQuery 16    │  2577.77 ms │          2590.83 ms │     no change │
│ QQuery 17    │  2535.10 ms │          2569.87 ms │     no change │
│ QQuery 18    │  5506.90 ms │          4897.52 ms │ +1.12x faster │
│ QQuery 19    │   119.36 ms │           123.21 ms │     no change │
│ QQuery 20    │  1900.62 ms │          1844.48 ms │     no change │
│ QQuery 21    │  2215.21 ms │          2152.16 ms │     no change │
│ QQuery 22    │  3830.99 ms │          3677.16 ms │     no change │
│ QQuery 23    │ 12366.74 ms │         12186.44 ms │     no change │
│ QQuery 24    │   212.29 ms │           211.25 ms │     no change │
│ QQuery 25    │   475.00 ms │           456.82 ms │     no change │
│ QQuery 26    │   220.23 ms │           213.57 ms │     no change │
│ QQuery 27    │  2769.03 ms │          2654.80 ms │     no change │
│ QQuery 28    │ 23318.28 ms │         24396.04 ms │     no change │
│ QQuery 29    │   943.58 ms │           973.65 ms │     no change │
│ QQuery 30    │  1318.27 ms │          1278.99 ms │     no change │
│ QQuery 31    │  1344.91 ms │          1306.33 ms │     no change │
│ QQuery 32    │  5032.81 ms │          4454.45 ms │ +1.13x faster │
│ QQuery 33    │  5567.72 ms │          5389.87 ms │     no change │
│ QQuery 34    │  5638.89 ms │          6574.74 ms │  1.17x slower │
│ QQuery 35    │  1940.63 ms │          1933.81 ms │     no change │
│ QQuery 36    │    64.13 ms │            63.61 ms │     no change │
│ QQuery 37    │    44.81 ms │            43.48 ms │     no change │
│ QQuery 38    │    66.09 ms │            64.95 ms │     no change │
│ QQuery 39    │   102.02 ms │           100.71 ms │     no change │
│ QQuery 40    │    26.86 ms │            27.24 ms │     no change │
│ QQuery 41    │    22.40 ms │            22.13 ms │     no change │
│ QQuery 42    │    19.60 ms │            19.54 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 92743.32ms │
│ Total Time (speedup_accumulate2)   │ 93127.94ms │
│ Average Time (HEAD)                │  2156.82ms │
│ Average Time (speedup_accumulate2) │  2165.77ms │
│ Queries Faster                     │          2 │
│ Queries Slower                     │          3 │
│ Queries with No Change             │         38 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 119.22 ms │           103.20 ms │ +1.16x faster │
│ QQuery 2     │  28.93 ms │            29.34 ms │     no change │
│ QQuery 3     │  37.58 ms │            37.68 ms │     no change │
│ QQuery 4     │  28.39 ms │            28.58 ms │     no change │
│ QQuery 5     │  86.40 ms │            85.37 ms │     no change │
│ QQuery 6     │  19.86 ms │            20.07 ms │     no change │
│ QQuery 7     │ 215.03 ms │           223.09 ms │     no change │
│ QQuery 8     │  31.60 ms │            31.79 ms │     no change │
│ QQuery 9     │  96.89 ms │            99.16 ms │     no change │
│ QQuery 10    │  62.40 ms │            63.81 ms │     no change │
│ QQuery 11    │  17.38 ms │            17.70 ms │     no change │
│ QQuery 12    │  48.98 ms │            49.68 ms │     no change │
│ QQuery 13    │  46.67 ms │            47.67 ms │     no change │
│ QQuery 14    │  13.34 ms │            13.30 ms │     no change │
│ QQuery 15    │  24.20 ms │            23.98 ms │     no change │
│ QQuery 16    │  23.64 ms │            24.36 ms │     no change │
│ QQuery 17    │ 148.03 ms │           139.74 ms │ +1.06x faster │
│ QQuery 18    │ 278.92 ms │           267.83 ms │     no change │
│ QQuery 19    │  36.96 ms │            37.55 ms │     no change │
│ QQuery 20    │  49.38 ms │            49.45 ms │     no change │
│ QQuery 21    │ 292.16 ms │           309.29 ms │  1.06x slower │
│ QQuery 22    │  17.03 ms │            17.17 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1722.99ms │
│ Total Time (speedup_accumulate2)   │ 1719.81ms │
│ Average Time (HEAD)                │   78.32ms │
│ Average Time (speedup_accumulate2) │   78.17ms │
│ Queries Faster                     │         2 │
│ Queries Slower                     │         1 │
│ Queries with No Change             │        19 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (05414e8) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2430.04 ms │          2428.00 ms │     no change │
│ QQuery 1     │   948.60 ms │           933.59 ms │     no change │
│ QQuery 2     │  1873.19 ms │          1885.25 ms │     no change │
│ QQuery 3     │  1217.29 ms │          1116.94 ms │ +1.09x faster │
│ QQuery 4     │  2304.46 ms │          2211.39 ms │     no change │
│ QQuery 5     │ 28674.73 ms │         28270.06 ms │     no change │
│ QQuery 6     │  4044.53 ms │          3973.92 ms │     no change │
│ QQuery 7     │  3594.85 ms │          2729.59 ms │ +1.32x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 45087.69ms │
│ Total Time (speedup_accumulate2)   │ 43548.72ms │
│ Average Time (HEAD)                │  5635.96ms │
│ Average Time (speedup_accumulate2) │  5443.59ms │
│ Queries Faster                     │          2 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          6 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.40 ms │             1.44 ms │     no change │
│ QQuery 1     │    50.89 ms │            50.19 ms │     no change │
│ QQuery 2     │   143.67 ms │           130.41 ms │ +1.10x faster │
│ QQuery 3     │   157.42 ms │           158.61 ms │     no change │
│ QQuery 4     │  1063.41 ms │          1071.67 ms │     no change │
│ QQuery 5     │  1356.53 ms │          1359.50 ms │     no change │
│ QQuery 6     │     1.43 ms │             1.45 ms │     no change │
│ QQuery 7     │    55.62 ms │            55.45 ms │     no change │
│ QQuery 8     │  1416.29 ms │          1408.25 ms │     no change │
│ QQuery 9     │  1843.02 ms │          1789.66 ms │     no change │
│ QQuery 10    │   344.50 ms │           343.59 ms │     no change │
│ QQuery 11    │   401.64 ms │           397.51 ms │     no change │
│ QQuery 12    │  1268.36 ms │          1232.84 ms │     no change │
│ QQuery 13    │  1943.36 ms │          1951.12 ms │     no change │
│ QQuery 14    │  1247.23 ms │          1236.60 ms │     no change │
│ QQuery 15    │  1201.25 ms │          1215.68 ms │     no change │
│ QQuery 16    │  2566.29 ms │          2560.04 ms │     no change │
│ QQuery 17    │  2588.45 ms │          2496.82 ms │     no change │
│ QQuery 18    │  5305.35 ms │          4789.56 ms │ +1.11x faster │
│ QQuery 19    │   116.06 ms │           120.40 ms │     no change │
│ QQuery 20    │  1869.04 ms │          1853.18 ms │     no change │
│ QQuery 21    │  2163.68 ms │          2142.07 ms │     no change │
│ QQuery 22    │  3735.25 ms │          3648.02 ms │     no change │
│ QQuery 23    │ 15366.11 ms │         12071.49 ms │ +1.27x faster │
│ QQuery 24    │   208.60 ms │           194.96 ms │ +1.07x faster │
│ QQuery 25    │   474.28 ms │           449.62 ms │ +1.05x faster │
│ QQuery 26    │   217.26 ms │           211.92 ms │     no change │
│ QQuery 27    │  2812.76 ms │          2680.87 ms │     no change │
│ QQuery 28    │ 23452.34 ms │         24301.96 ms │     no change │
│ QQuery 29    │   960.37 ms │           949.23 ms │     no change │
│ QQuery 30    │  1328.00 ms │          1247.63 ms │ +1.06x faster │
│ QQuery 31    │  1333.53 ms │          1313.53 ms │     no change │
│ QQuery 32    │  4996.13 ms │          4340.39 ms │ +1.15x faster │
│ QQuery 33    │  5589.77 ms │          5356.82 ms │     no change │
│ QQuery 34    │  5703.31 ms │          5747.31 ms │     no change │
│ QQuery 35    │  1931.30 ms │          1910.32 ms │     no change │
│ QQuery 36    │    64.90 ms │            64.96 ms │     no change │
│ QQuery 37    │    44.46 ms │            44.47 ms │     no change │
│ QQuery 38    │    66.37 ms │            63.93 ms │     no change │
│ QQuery 39    │   101.29 ms │            98.61 ms │     no change │
│ QQuery 40    │    25.52 ms │            25.54 ms │     no change │
│ QQuery 41    │    22.95 ms │            22.16 ms │     no change │
│ QQuery 42    │    19.61 ms │            19.94 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 95558.98ms │
│ Total Time (speedup_accumulate2)   │ 91129.72ms │
│ Average Time (HEAD)                │  2222.30ms │
│ Average Time (speedup_accumulate2) │  2119.30ms │
│ Queries Faster                     │          7 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │         36 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 119.96 ms │            99.51 ms │ +1.21x faster │
│ QQuery 2     │  28.32 ms │            28.35 ms │     no change │
│ QQuery 3     │  36.80 ms │            34.15 ms │ +1.08x faster │
│ QQuery 4     │  28.72 ms │            28.13 ms │     no change │
│ QQuery 5     │  85.22 ms │            84.80 ms │     no change │
│ QQuery 6     │  19.65 ms │            19.58 ms │     no change │
│ QQuery 7     │ 221.25 ms │           219.36 ms │     no change │
│ QQuery 8     │  31.07 ms │            35.35 ms │  1.14x slower │
│ QQuery 9     │  97.58 ms │            98.66 ms │     no change │
│ QQuery 10    │  62.74 ms │            62.49 ms │     no change │
│ QQuery 11    │  17.32 ms │            17.49 ms │     no change │
│ QQuery 12    │  49.86 ms │            49.32 ms │     no change │
│ QQuery 13    │  51.52 ms │            48.20 ms │ +1.07x faster │
│ QQuery 14    │  13.22 ms │            13.51 ms │     no change │
│ QQuery 15    │  23.93 ms │            23.71 ms │     no change │
│ QQuery 16    │  24.49 ms │            25.54 ms │     no change │
│ QQuery 17    │ 147.53 ms │           142.85 ms │     no change │
│ QQuery 18    │ 270.08 ms │           270.62 ms │     no change │
│ QQuery 19    │  36.95 ms │            36.76 ms │     no change │
│ QQuery 20    │  49.84 ms │            47.69 ms │     no change │
│ QQuery 21    │ 305.23 ms │           301.66 ms │     no change │
│ QQuery 22    │  17.30 ms │            17.20 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1738.57ms │
│ Total Time (speedup_accumulate2)   │ 1704.91ms │
│ Average Time (HEAD)                │   79.03ms │
│ Average Time (speedup_accumulate2) │   77.50ms │
│ Queries Faster                     │         3 │
│ Queries Slower                     │         1 │
│ Queries with No Change             │        18 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan
Contributor Author

run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (05414e8) to 70daf88 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2359.98 ms │          2323.26 ms │     no change │
│ QQuery 1     │   960.92 ms │           937.80 ms │     no change │
│ QQuery 2     │  1900.89 ms │          1856.32 ms │     no change │
│ QQuery 3     │  1164.73 ms │          1088.66 ms │ +1.07x faster │
│ QQuery 4     │  2234.67 ms │          2169.05 ms │     no change │
│ QQuery 5     │ 28132.49 ms │         28395.27 ms │     no change │
│ QQuery 6     │  3982.94 ms │          3927.72 ms │     no change │
│ QQuery 7     │  3396.58 ms │          2683.86 ms │ +1.27x faster │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 44133.20ms │
│ Total Time (speedup_accumulate2)   │ 43381.94ms │
│ Average Time (HEAD)                │  5516.65ms │
│ Average Time (speedup_accumulate2) │  5422.74ms │
│ Queries Faster                     │          2 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          6 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.41 ms │             1.45 ms │     no change │
│ QQuery 1     │    48.51 ms │            48.90 ms │     no change │
│ QQuery 2     │   135.63 ms │           136.84 ms │     no change │
│ QQuery 3     │   154.79 ms │           150.20 ms │     no change │
│ QQuery 4     │  1029.53 ms │          1107.62 ms │  1.08x slower │
│ QQuery 5     │  1333.91 ms │          1373.32 ms │     no change │
│ QQuery 6     │     1.39 ms │             1.48 ms │  1.06x slower │
│ QQuery 7     │    54.75 ms │            54.70 ms │     no change │
│ QQuery 8     │  1386.22 ms │          1443.12 ms │     no change │
│ QQuery 9     │  1737.10 ms │          1841.39 ms │  1.06x slower │
│ QQuery 10    │   334.82 ms │           353.83 ms │  1.06x slower │
│ QQuery 11    │   387.78 ms │           398.41 ms │     no change │
│ QQuery 12    │  1223.90 ms │          1283.75 ms │     no change │
│ QQuery 13    │  1904.77 ms │          1922.12 ms │     no change │
│ QQuery 14    │  1213.00 ms │          1253.31 ms │     no change │
│ QQuery 15    │  1168.22 ms │          1267.19 ms │  1.08x slower │
│ QQuery 16    │  2474.37 ms │          2529.90 ms │     no change │
│ QQuery 17    │  2468.18 ms │          2498.91 ms │     no change │
│ QQuery 18    │  5046.85 ms │          4795.28 ms │     no change │
│ QQuery 19    │   119.41 ms │           120.04 ms │     no change │
│ QQuery 20    │  1914.88 ms │          1851.77 ms │     no change │
│ QQuery 21    │  2173.75 ms │          2142.49 ms │     no change │
│ QQuery 22    │  3714.84 ms │          3659.76 ms │     no change │
│ QQuery 23    │ 15000.35 ms │         12234.04 ms │ +1.23x faster │
│ QQuery 24    │   221.04 ms │           208.55 ms │ +1.06x faster │
│ QQuery 25    │   467.77 ms │           463.54 ms │     no change │
│ QQuery 26    │   218.51 ms │           203.90 ms │ +1.07x faster │
│ QQuery 27    │  2723.56 ms │          2673.27 ms │     no change │
│ QQuery 28    │ 23601.79 ms │         24220.30 ms │     no change │
│ QQuery 29    │   944.25 ms │           975.80 ms │     no change │
│ QQuery 30    │  1303.28 ms │          1268.70 ms │     no change │
│ QQuery 31    │  1313.59 ms │          1286.79 ms │     no change │
│ QQuery 32    │  4755.09 ms │          4150.85 ms │ +1.15x faster │
│ QQuery 33    │  5426.56 ms │          5487.61 ms │     no change │
│ QQuery 34    │  5749.85 ms │          5581.32 ms │     no change │
│ QQuery 35    │  1867.04 ms │          1908.30 ms │     no change │
│ QQuery 36    │    65.18 ms │            66.12 ms │     no change │
│ QQuery 37    │    43.54 ms │            43.16 ms │     no change │
│ QQuery 38    │    65.72 ms │            67.94 ms │     no change │
│ QQuery 39    │    99.31 ms │           100.11 ms │     no change │
│ QQuery 40    │    25.53 ms │            24.57 ms │     no change │
│ QQuery 41    │    21.87 ms │            22.26 ms │     no change │
│ QQuery 42    │    19.85 ms │            19.18 ms │     no change │
└──────────────┴─────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 93961.67ms │
│ Total Time (speedup_accumulate2)   │ 91242.09ms │
│ Average Time (HEAD)                │  2185.16ms │
│ Average Time (speedup_accumulate2) │  2121.91ms │
│ Queries Faster                     │          4 │
│ Queries Slower                     │          5 │
│ Queries with No Change             │         34 │
│ Queries with Failure               │          0 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 118.58 ms │           100.83 ms │ +1.18x faster │
│ QQuery 2     │  26.69 ms │            29.17 ms │  1.09x slower │
│ QQuery 3     │  36.70 ms │            37.95 ms │     no change │
│ QQuery 4     │  28.89 ms │            28.55 ms │     no change │
│ QQuery 5     │  84.60 ms │            85.80 ms │     no change │
│ QQuery 6     │  19.72 ms │            19.79 ms │     no change │
│ QQuery 7     │ 228.01 ms │           219.05 ms │     no change │
│ QQuery 8     │  34.07 ms │            35.34 ms │     no change │
│ QQuery 9     │  95.50 ms │           101.36 ms │  1.06x slower │
│ QQuery 10    │  60.55 ms │            61.89 ms │     no change │
│ QQuery 11    │  17.42 ms │            17.84 ms │     no change │
│ QQuery 12    │  49.90 ms │            49.42 ms │     no change │
│ QQuery 13    │  46.57 ms │            46.79 ms │     no change │
│ QQuery 14    │  13.43 ms │            13.17 ms │     no change │
│ QQuery 15    │  24.26 ms │            23.92 ms │     no change │
│ QQuery 16    │  23.64 ms │            23.75 ms │     no change │
│ QQuery 17    │ 146.32 ms │           140.88 ms │     no change │
│ QQuery 18    │ 267.88 ms │           263.49 ms │     no change │
│ QQuery 19    │  36.76 ms │            36.56 ms │     no change │
│ QQuery 20    │  47.63 ms │            48.14 ms │     no change │
│ QQuery 21    │ 290.33 ms │           311.08 ms │  1.07x slower │
│ QQuery 22    │  17.22 ms │            17.19 ms │     no change │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1714.64ms │
│ Total Time (speedup_accumulate2)   │ 1711.97ms │
│ Average Time (HEAD)                │   77.94ms │
│ Average Time (speedup_accumulate2) │   77.82ms │
│ Queries Faster                     │         1 │
│ Queries Slower                     │         3 │
│ Queries with No Change             │        18 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan
Contributor Author

Query 1 in tpch_mem is consistently 15%-20% faster with this change.
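
As a rough check of that figure, the sketch below recomputes the Query 1 numbers from the two tpch_mem_sf1 runs posted above. The helper names `speedup` and `runtime_reduction_pct` are made up for illustration and are not part of the benchmark tooling. It assumes the "Change" column for faster queries is the HEAD time divided by the branch time, which matches the posted values.

```rust
/// Speedup ratio (assumed to match the "+N.NNx faster" column):
/// HEAD time divided by branch time.
fn speedup(head_ms: f64, branch_ms: f64) -> f64 {
    head_ms / branch_ms
}

/// Share of HEAD's runtime that the branch saves, in percent.
fn runtime_reduction_pct(head_ms: f64, branch_ms: f64) -> f64 {
    (head_ms - branch_ms) / head_ms * 100.0
}

fn main() {
    // (HEAD ms, speedup_accumulate2 ms) for Query 1, copied from the
    // two tpch_mem_sf1 tables in this thread.
    let runs = [(119.96, 99.51), (118.58, 100.83)];
    for (head, branch) in runs {
        println!(
            "speedup: {:.2}x, runtime reduction: {:.1}%",
            speedup(head, branch),
            runtime_reduction_pct(head, branch)
        );
    }
}
```

The two measures differ: a 1.21x speedup corresponds to roughly a 17% reduction in runtime, so which one "15%-20% faster" refers to depends on the convention used.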

@Dandandan
Contributor Author

run benchmark tpch

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing speedup_accumulate2 (2b3eebf) to 70daf88 diff using: tpch
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and speedup_accumulate2
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ speedup_accumulate2 ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1     │ 192.45 ms │           183.81 ms │ no change │
│ QQuery 2     │  91.33 ms │            94.40 ms │ no change │
│ QQuery 3     │ 125.49 ms │           125.37 ms │ no change │
│ QQuery 4     │  75.03 ms │            74.74 ms │ no change │
│ QQuery 5     │ 171.44 ms │           171.00 ms │ no change │
│ QQuery 6     │  65.68 ms │            63.92 ms │ no change │
│ QQuery 7     │ 211.23 ms │           210.98 ms │ no change │
│ QQuery 8     │ 161.78 ms │           162.56 ms │ no change │
│ QQuery 9     │ 225.44 ms │           220.75 ms │ no change │
│ QQuery 10    │ 185.34 ms │           185.20 ms │ no change │
│ QQuery 11    │  72.67 ms │            74.63 ms │ no change │
│ QQuery 12    │ 115.36 ms │           115.43 ms │ no change │
│ QQuery 13    │ 210.35 ms │           217.54 ms │ no change │
│ QQuery 14    │  92.09 ms │            92.21 ms │ no change │
│ QQuery 15    │ 121.00 ms │           119.46 ms │ no change │
│ QQuery 16    │  56.30 ms │            55.55 ms │ no change │
│ QQuery 17    │ 267.60 ms │           262.20 ms │ no change │
│ QQuery 18    │ 307.22 ms │           304.14 ms │ no change │
│ QQuery 19    │ 133.01 ms │           133.46 ms │ no change │
│ QQuery 20    │ 125.55 ms │           122.76 ms │ no change │
│ QQuery 21    │ 255.25 ms │           260.91 ms │ no change │
│ QQuery 22    │  42.57 ms │            42.26 ms │ no change │
└──────────────┴───────────┴─────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 3304.18ms │
│ Total Time (speedup_accumulate2)   │ 3293.29ms │
│ Average Time (HEAD)                │  150.19ms │
│ Average Time (speedup_accumulate2) │  149.69ms │
│ Queries Faster                     │         0 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        22 │
│ Queries with Failure               │         0 │
└────────────────────────────────────┴───────────┘

@Dandandan force-pushed the speedup_accumulate2 branch from fb8f6ac to 05414e8 on January 4, 2026 12:53

Labels

functions Changes to functions implementation

Development

Successfully merging this pull request may close these issues.

Optimize NullState for non-null data

2 participants