Use PtrArrays.jl to reduce allocations in bulk sampling #162

Closed
ameligrana wants to merge 3 commits into main from Tortar-patch-4

Conversation

@ameligrana

Collaborator

Opened to compare with #161.

@github-actions

github-actions Bot commented Jun 28, 2025

Benchmark Results

| Benchmark | f72a7bc | 71b42cf | f72a7bc / 71b42cf |
|---|---|---|---|
| TTFX excluding time to load | 0.0524 ± 0 s | 0.0526 ± 0 s | 0.994 ± 0, 0.98 ± 0, 0.997 ± 0 |
| code size in bytes | 1.51e+04 ± 0 h | 1.52e+04 ± 0 h | 0.998 ± 0 |
| code size in lines | 564 ± 0 h | 569 ± 0 h | 0.991 ± 0 |
| code size in syntax nodes | 4.08e+03 ± 0 h | 4.09e+03 ± 0 h | 0.998 ± 0 |
| constructor n=100 σ=0.1 | 5.78 ± 0.12 μs | 5.87 ± 0.14 μs | 0.997 ± 0.032, 0.971 ± 0.19, 0.985 ± 0.031 |
| constructor n=100 σ=1.0 | 6.13 ± 0.21 μs | 6.33 ± 0.25 μs | 1.02 ± 0.059, 1.04 ± 0.15, 0.968 ± 0.051 |
| constructor n=100 σ=10.0 | 6.78 ± 2.6 μs | 6.72 ± 0.16 μs | 1.01 ± 0.046, 0.963 ± 0.36, 1.01 ± 0.39 |
| constructor n=100 σ=100.0 | 10.3 ± 4.1 μs | 10.5 ± 3.9 μs | 0.991 ± 0.35, 0.982 ± 0.56, 0.988 ± 0.54 |
| constructor n=1000 σ=0.1 | 0.0417 ± 0.001 ms | 0.0421 ± 0.0011 ms | 1 ± 0.039, 1.04 ± 0.086, 0.99 ± 0.035 |
| constructor n=1000 σ=1.0 | 0.0438 ± 0.00096 ms | 0.045 ± 0.0015 ms | 0.983 ± 0.047, 0.968 ± 0.097, 0.974 ± 0.039 |
| constructor n=1000 σ=10.0 | 0.0511 ± 0.00085 ms | 0.0516 ± 0.0011 ms | 1 ± 0.023, 0.97 ± 0.088, 0.99 ± 0.026 |
| constructor n=1000 σ=100.0 | 0.0598 ± 0.001 ms | 0.06 ± 0.00099 ms | 1.01 ± 0.027, 0.998 ± 0.066, 0.998 ± 0.024 |
| constructor n=10000 σ=0.1 | 0.417 ± 0.018 ms | 0.424 ± 0.026 ms | 1 ± 0.086, 0.98 ± 0.09, 0.984 ± 0.073 |
| constructor n=10000 σ=1.0 | 0.426 ± 0.025 ms | 0.427 ± 0.017 ms | 1 ± 0.086, 1.09 ± 0.13, 0.998 ± 0.07 |
| constructor n=10000 σ=10.0 | 0.453 ± 0.032 ms | 0.456 ± 0.029 ms | 0.938 ± 0.11, 1.02 ± 0.12, 0.994 ± 0.094 |
| constructor n=10000 σ=100.0 | 0.585 ± 0.023 ms | 0.59 ± 0.031 ms | 1 ± 0.071, 1.01 ± 0.073, 0.993 ± 0.065 |
| delete ∘ rand n=100 σ=0.1 | 4.46 ± 0.19 μs | 4.47 ± 0.2 μs | 0.996 ± 0.061, 1.01 ± 0.061, 0.998 ± 0.062 |
| delete ∘ rand n=100 σ=1.0 | 4.72 ± 0.2 μs | 4.73 ± 0.19 μs | 0.998 ± 0.057, 1 ± 0.057, 0.998 ± 0.058 |
| delete ∘ rand n=100 σ=10.0 | 4.88 ± 0.18 μs | 4.91 ± 0.18 μs | 0.998 ± 0.052, 0.996 ± 0.053, 0.994 ± 0.052 |
| delete ∘ rand n=100 σ=100.0 | 7.29 ± 0.48 μs | 7.32 ± 0.45 μs | 1 ± 0.089, 0.996 ± 0.088, 0.996 ± 0.09 |
| delete ∘ rand n=1000 σ=0.1 | 0.0441 ± 0.00073 ms | 0.0442 ± 0.00084 ms | 1 ± 0.027, 1.01 ± 0.031, 0.998 ± 0.025 |
| delete ∘ rand n=1000 σ=1.0 | 0.048 ± 0.00097 ms | 0.048 ± 0.00086 ms | 0.995 ± 0.029, 1 ± 0.03, 1 ± 0.027 |
| delete ∘ rand n=1000 σ=10.0 | 0.0492 ± 0.00088 ms | 0.0494 ± 0.00081 ms | 0.995 ± 0.025, 1.01 ± 0.028, 0.996 ± 0.024 |
| delete ∘ rand n=1000 σ=100.0 | 0.0541 ± 0.001 ms | 0.0543 ± 0.00097 ms | 1 ± 0.025, 0.995 ± 0.025, 0.996 ± 0.026 |
| delete ∘ rand n=10000 σ=0.1 | 0.481 ± 0.014 ms | 0.479 ± 0.013 ms | 1 ± 0.037, 1 ± 0.046, 1.01 ± 0.039 |
| delete ∘ rand n=10000 σ=1.0 | 0.517 ± 0.014 ms | 0.513 ± 0.014 ms | 1 ± 0.029, 1 ± 0.041, 1.01 ± 0.039 |
| delete ∘ rand n=10000 σ=10.0 | 0.522 ± 0.013 ms | 0.523 ± 0.014 ms | 1.01 ± 0.034, 1 ± 0.04, 0.999 ± 0.036 |
| delete ∘ rand n=10000 σ=100.0 | 0.518 ± 0.013 ms | 0.515 ± 0.015 ms | 1 ± 0.035, 0.999 ± 0.043, 1.01 ± 0.039 |
| empty constructor | 1.77 ± 0.28 μs | 1.94 ± 0.17 μs | 0.936 ± 0.22, 1.09 ± 0.45, 0.912 ± 0.17 |
| intermixed_h n=100 σ=0.1 | 11.4 ± 0.86 μs | 11.5 ± 0.94 μs | 1.02 ± 0.13, 1.01 ± 0.17, 0.991 ± 0.11 |
| intermixed_h n=100 σ=1.0 | 11.8 ± 1.2 μs | 11.8 ± 1.1 μs | 1.02 ± 0.13, 1.02 ± 0.17, 1 ± 0.14 |
| intermixed_h n=100 σ=10.0 | 11.6 ± 1.4 μs | 11.8 ± 1.9 μs | 1.03 ± 0.2, 1.03 ± 0.34, 0.985 ± 0.2 |
| intermixed_h n=100 σ=100.0 | 12.5 ± 1.3 μs | 12.5 ± 1.3 μs | 1.02 ± 0.14, 1.04 ± 0.19, 1 ± 0.15 |
| intermixed_h n=1000 σ=0.1 | 0.109 ± 0.009 ms | 0.115 ± 0.0084 ms | 1.01 ± 0.11, 1.01 ± 0.11, 0.946 ± 0.1 |
| intermixed_h n=1000 σ=1.0 | 0.112 ± 0.0087 ms | 0.111 ± 0.0087 ms | 1.02 ± 0.11, 1.02 ± 0.11, 1 ± 0.11 |
| intermixed_h n=1000 σ=10.0 | 0.107 ± 0.0093 ms | 0.105 ± 0.0087 ms | 1.03 ± 0.13, 1.02 ± 0.12, 1.01 ± 0.12 |
| intermixed_h n=1000 σ=100.0 | 0.116 ± 0.013 ms | 0.115 ± 0.012 ms | 1.03 ± 0.16, 1.03 ± 0.15, 1.01 ± 0.16 |
| intermixed_h n=10000 σ=0.1 | 1.2 ± 0.19 ms | 1.13 ± 0.16 ms | 1.01 ± 0.2, 1.04 ± 0.25, 1.06 ± 0.23 |
| intermixed_h n=10000 σ=1.0 | 1.19 ± 0.18 ms | 1.17 ± 0.13 ms | 1.02 ± 0.19, 1.01 ± 0.22, 1.02 ± 0.19 |
| intermixed_h n=10000 σ=10.0 | 1.13 ± 0.18 ms | 1.09 ± 0.17 ms | 0.961 ± 0.24, 1.03 ± 0.24, 1.04 ± 0.23 |
| intermixed_h n=10000 σ=100.0 | 1.16 ± 0.22 ms | 1.14 ± 0.21 ms | 1.01 ± 0.23, 1.02 ± 0.21, 1.02 ± 0.27 |
| pathological 1 | 0.0457 ± 0.00031 μs | 0.0456 ± 0.00025 μs | 1 ± 0.0085, 0.998 ± 0.01, 1 ± 0.0087 |
| pathological 1′ | 0.178 ± 0.0018 μs | 0.178 ± 0.0016 μs | 0.965 ± 0.012, 1.01 ± 0.015, 1 ± 0.014 |
| pathological 2 | 0.0636 ± 0.00039 μs | 0.0636 ± 0.00026 μs | 0.921 ± 0.0055, 1 ± 0.0067, 1 ± 0.0074 |
| pathological 2′ | 0.206 ± 0.0013 μs | 0.194 ± 0.0016 μs | 1.06 ± 0.011, 1.03 ± 0.011, 1.06 ± 0.011 |
| pathological 2′′ | 0.229 ± 0.0045 μs | 0.221 ± 0.023 μs | 0.999 ± 0.21, 1.02 ± 0.078, 1.04 ± 0.11 |
| pathological 3 | 16.9 ± 0.23 ns | 16.9 ± 0.23 ns | 1 ± 0.019, 1 ± 0.019, 0.998 ± 0.019 |
| pathological 4 | 0.063 ± 0.00024 μs | 0.063 ± 0.00038 μs | 1 ± 0.007, 1 ± 0.0061, 1 ± 0.0072 |
| pathological 4′ | 0.197 ± 0.0021 μs | 0.2 ± 0.0016 μs | 1 ± 0.014, 0.989 ± 0.013, 0.986 ± 0.013 |
| pathological 4′′ | 0.225 ± 0.015 μs | 0.225 ± 0.0065 μs | 1.01 ± 0.089, 1.05 ± 0.12, 1 ± 0.071 |
| pathological 5a | 0.0453 ± 0.00028 μs | 0.0453 ± 0.0002 μs | 0.999 ± 0.0078, 0.999 ± 0.009, 1 ± 0.0076 |
| pathological 5b | 0.0464 ± 0.00043 μs | 0.0453 ± 0.00022 μs | 1.12 ± 0.029, 1.03 ± 0.096, 1.02 ± 0.011 |
| pathological 5b′ | 0.341 ± 0.0036 μs | 0.341 ± 0.0035 μs | 0.967 ± 0.014, 1.02 ± 0.014, 0.999 ± 0.015 |
| pathological 5b′′ | 0.344 ± 0.0033 μs | 0.345 ± 0.012 μs | 1 ± 0.045, 1.01 ± 0.062, 0.997 ± 0.037 |
| pathological large compaction (133380-op) | 16.4 ± 1.3 ms | 15 ± 0.23 ms | 0.988 ± 0.021, 1.01 ± 0.027, 1.1 ± 0.091 |
| pathological medium compaction (1254-op) | 0.104 ± 0.013 ms | 0.104 ± 0.022 ms | 0.996 ± 0.14, 1 ± 0.24, 0.992 ± 0.25 |
| pathological old compaction (6-op) | 0.224 μs | 0.212 μs | 1.29, 1.04, 1.06 |
| pathological small compaction (18-op) | 0.924 ± 0.012 μs | 0.918 ± 0.011 μs | 0.993 ± 0.023, 1.02 ± 0.017, 1.01 ± 0.017 |
| pathological tiny compaction (6-op) | 0.287 ± 0.0029 μs | 0.282 ± 0.0043 μs | 0.99 ± 0.036, 1 ± 0.024, 1.02 ± 0.019 |
| sample (bulk) n=1000 k=10000 σ=1 | 0.166 ± 0.02 ms | 0.181 ± 0.02 ms | 0.91 ± 0.15, 0.909 ± 0.16, 0.918 ± 0.15 |
| sample (bulk) n=1000 k=10000 σ=100 | 0.1 ± 0.037 ms | 0.113 ± 0.035 ms | 0.885 ± 0.43, 0.93 ± 0.44, 0.891 ± 0.43 |
| sample (bulk) n=1000 k=1000000 σ=1 | 16.1 ± 2.1 ms | 17.2 ± 2.3 ms | 0.926 ± 0.19, 0.934 ± 0.17, 0.934 ± 0.17 |
| sample (bulk) n=1000 k=1000000 σ=100 | 9.04 ± 3.7 ms | 8.63 ± 4.3 ms | 0.834 ± 0.55, 0.883 ± 0.53, 1.05 ± 0.67 |
| sample (bulk) n=1000000 k=10000 σ=1 | 0.342 ± 0.018 ms | 0.479 ± 0.033 ms | 0.865 ± 0.088, 0.871 ± 0.15, 0.713 ± 0.062 |
| sample (bulk) n=1000000 k=10000 σ=100 | 0.169 ± 0.042 ms | 0.175 ± 0.017 ms | 0.991 ± 0.26, 0.95 ± 0.32, 0.966 ± 0.26 |
| sample (bulk) n=1000000 k=1000000 σ=1 | 21.6 ± 1.3 ms | 23.6 ± 0.5 ms | 0.864 ± 0.063, 0.87 ± 0.068, 0.913 ± 0.06 |
| sample (bulk) n=1000000 k=1000000 σ=100 | 9.83 ± 2.9 ms | 9.6 ± 2.3 ms | 0.853 ± 0.56, 0.931 ± 0.34, 1.02 ± 0.39 |
| sample n=100 σ=0.1 | 25.4 ± 0.66 ns | 25.5 ± 0.63 ns | 1 ± 0.035, 0.979 ± 0.035, 0.997 ± 0.036 |
| sample n=100 σ=1.0 | 29.4 ± 2.3 ns | 29.6 ± 2.2 ns | 0.998 ± 0.11, 0.979 ± 0.11, 0.994 ± 0.11 |
| sample n=100 σ=10.0 | 19.4 ± 5 ns | 19.4 ± 4.5 ns | 1 ± 0.34, 0.978 ± 0.34, 0.997 ± 0.35 |
| sample n=100 σ=100.0 | 16.9 ± 3.8 ns | 17 ± 3.8 ns | 0.998 ± 0.33, 0.985 ± 0.33, 0.996 ± 0.32 |
| sample n=1000 σ=0.1 | 23.3 ± 6.4 ns | 23.2 ± 4.6 ns | 1.01 ± 0.31, 0.989 ± 0.34, 1 ± 0.34 |
| sample n=1000 σ=1.0 | 0.0322 ± 0.0022 μs | 0.0322 ± 0.0024 μs | 1 ± 0.1, 0.979 ± 0.097, 0.998 ± 0.1 |
| sample n=1000 σ=10.0 | 20.4 ± 5.2 ns | 20.5 ± 5.2 ns | 0.989 ± 0.35, 0.969 ± 0.39, 0.997 ± 0.36 |
| sample n=1000 σ=100.0 | 17 ± 3.9 ns | 17.1 ± 4.1 ns | 1 ± 0.32, 0.992 ± 0.36, 0.99 ± 0.33 |
| sample n=10000 σ=0.1 | 0.0318 ± 0.0012 μs | 0.0318 ± 0.0013 μs | 1.01 ± 0.054, 0.984 ± 0.059, 0.999 ± 0.055 |
| sample n=10000 σ=1.0 | 0.0344 ± 0.0014 μs | 0.0344 ± 0.0014 μs | 1.01 ± 0.06, 1 ± 0.15, 0.999 ± 0.058 |
| sample n=10000 σ=10.0 | 21.8 ± 6 ns | 21.2 ± 4.9 ns | 1.03 ± 0.38, 0.96 ± 0.4, 1.03 ± 0.37 |
| sample n=10000 σ=100.0 | 17.1 ± 4.7 ns | 17.6 ± 4.2 ns | 1.01 ± 0.31, 0.963 ± 0.35, 0.973 ± 0.35 |
| summarysize n=100 σ=0.1 | 1.19e+05 ± 0 h | 1.19e+05 ± 0 h | 1 ± 0 |
| summarysize n=100 σ=1.0 | 1.19e+05 ± 0 h | 1.19e+05 ± 0 h | 1 ± 0 |
| summarysize n=100 σ=10.0 | 1.19e+05 ± 0 h | 1.19e+05 ± 0 h | 1 ± 0 |
| summarysize n=100 σ=100.0 | 1.19e+05 ± 0 h | 1.19e+05 ± 0 h | 1 ± 0 |
| summarysize n=1000 σ=0.1 | 1.52e+05 ± 0 h | 1.52e+05 ± 0 h | 1 ± 0 |
| summarysize n=1000 σ=1.0 | 1.52e+05 ± 0 h | 1.52e+05 ± 0 h | 1 ± 0 |
| summarysize n=1000 σ=10.0 | 1.52e+05 ± 0 h | 1.52e+05 ± 0 h | 1 ± 0 |
| summarysize n=1000 σ=100.0 | 1.52e+05 ± 0 h | 1.52e+05 ± 0 h | 1 ± 0 |
| summarysize n=10000 σ=0.1 | 1.13e+06 ± 0 h | 1.13e+06 ± 0 h | 1 ± 0 |
| summarysize n=10000 σ=1.0 | 1.13e+06 ± 0 h | 1.13e+06 ± 0 h | 1 ± 0 |
| summarysize n=10000 σ=10.0 | 1.13e+06 ± 0 h | 1.13e+06 ± 0 h | 1 ± 0 |
| summarysize n=10000 σ=100.0 | 1.13e+06 ± 0 h | 1.13e+06 ± 0 h | 1 ± 0 |
| update ∘ rand n=100 σ=0.1 | 0.0827 ± 0.0022 μs | 0.082 ± 0.0021 μs | 1 ± 0.037, 1 ± 0.037, 1.01 ± 0.037 |
| update ∘ rand n=100 σ=1.0 | 0.0889 ± 0.0026 μs | 0.0884 ± 0.0025 μs | 1 ± 0.043, 1 ± 0.043, 1.01 ± 0.041 |
| update ∘ rand n=100 σ=10.0 | 0.0988 ± 0.0037 μs | 0.0987 ± 0.0032 μs | 1 ± 0.048, 1 ± 0.047, 1 ± 0.05 |
| update ∘ rand n=100 σ=100.0 | 0.175 ± 0.015 μs | 0.175 ± 0.015 μs | 0.993 ± 0.12, 1.01 ± 0.12, 1 ± 0.12 |
| update ∘ rand n=1000 σ=0.1 | 0.0827 ± 0.0022 μs | 0.0824 ± 0.0021 μs | 1 ± 0.036, 1 ± 0.035, 1 ± 0.037 |
| update ∘ rand n=1000 σ=1.0 | 0.0885 ± 0.0019 μs | 0.0881 ± 0.0019 μs | 1 ± 0.034, 1 ± 0.03, 1 ± 0.03 |
| update ∘ rand n=1000 σ=10.0 | 0.0961 ± 0.0018 μs | 0.096 ± 0.0018 μs | 1.01 ± 0.027, 0.998 ± 0.029, 1 ± 0.027 |
| update ∘ rand n=1000 σ=100.0 | 0.171 ± 0.0065 μs | 0.17 ± 0.0063 μs | 1 ± 0.055, 1 ± 0.056, 1 ± 0.053 |
| update ∘ rand n=10000 σ=0.1 | 0.0905 ± 0.0018 μs | 0.0899 ± 0.0012 μs | 1 ± 0.021, 1 ± 0.017, 1.01 ± 0.025 |
| update ∘ rand n=10000 σ=1.0 | 0.0946 ± 0.0018 μs | 0.0932 ± 0.001 μs | 1 ± 0.018, 0.996 ± 0.017, 1.02 ± 0.022 |
| update ∘ rand n=10000 σ=10.0 | 0.0954 ± 0.0018 μs | 0.0942 ± 0.0019 μs | 1 ± 0.017, 0.998 ± 0.016, 1.01 ± 0.028 |
| update ∘ rand n=10000 σ=100.0 | 0.165 ± 0.0032 μs | 0.164 ± 0.0022 μs | 0.996 ± 0.02, 0.999 ± 0.024, 1.01 ± 0.024 |
| time_to_load | 0.0836 ± 0.001 s | 0.083 ± 0.00081 s | 0.998 ± 0.019, 0.993 ± 0.035, 1.01 ± 0.016 |
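
A note on reading the ratio column: in the rows checked by hand, the last of the comma-separated ratios is consistent with dividing the two displayed timings and propagating their uncertainties to first order, assuming independent errors. That is an inference from the numbers, not something the benchmark comment states, and the meaning of the other ratio values is not given here. The sketch below is a minimal illustration of that arithmetic in Python; the function names `ratio_with_uncertainty` and `excludes_one` are hypothetical helpers, not part of any benchmarking tool.

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return (a / b, uncertainty), assuming independent errors on a and b.

    First-order propagation: the relative errors add in quadrature.
    """
    ratio = a / b
    rel = math.hypot(sigma_a / a, sigma_b / b)
    return ratio, ratio * rel


def excludes_one(ratio, sigma):
    """True if 1.0 lies outside the one-sigma interval around the ratio."""
    return abs(ratio - 1.0) > sigma


if __name__ == "__main__":
    # Timings copied from the row "sample (bulk) n=1000000 k=10000 σ=1" (ms).
    r, s = ratio_with_uncertainty(0.342, 0.018, 0.479, 0.033)
    print(f"{r:.3f} ± {s:.3f}", excludes_one(r, s))
```

For that row the sketch gives about 0.714 ± 0.062, close to the reported 0.713 ± 0.062; the small difference is presumably rounding of the displayed timings. A one-sigma check like `excludes_one` is only a rough filter for deciding which rows differ by more than their noise.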

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@ameligrana

Collaborator Author

This actually seems slower than the current implementation.

@ameligrana ameligrana closed this Jun 28, 2025
@ameligrana

ameligrana commented Jun 28, 2025

Collaborator Author

Interestingly, this PR hit the following error in the Julia Pre CI job (https://github.com/LilithHafner/WeightVectors.jl/actions/runs/15939048226/job/44964315259):

double free or corruption (!prev)
[2192] signal 6 (-6): Aborted

Not sure why, but it might be of interest to you, @LilithHafner.
