
Document block size #12

Closed · wants to merge 3 commits

Conversation

@arj03 (Contributor) commented Jun 5, 2018

I also changed the block size parameter in the benchmark, as that is the default and I don't want to give the wrong impression (worse performance).
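
For context, a sketch of where that parameter lives, assuming the log constructor takes a `blockSize` option as in the flumelog-offset README (the path and codec here are just placeholders):

```js
// Sketch: open the log with an explicit block size.
// `blockSize` is the number of bytes per block in the underlying
// aligned-block-file.
var OffsetLog = require('flumelog-offset')
var codec = require('flumecodec/json')

var log = OffsetLog('/tmp/bench.offset', {
  blockSize: 64 * 1024, // e.g. 64 KiB; the benchmarks below vary this
  codec: codec
})
```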

@dominictarr (Collaborator)

I notice that node streams use 64k blocks...
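
(As a quick check of that default: Node's fs read streams use a 64 KiB highWaterMark, which you can confirm directly.)

```js
// Print the default read-stream buffer size on this Node install;
// fs.createReadStream defaults to a 64 KiB (65536-byte) highWaterMark.
var fs = require('fs')
var stream = fs.createReadStream(__filename)
console.log(stream.readableHighWaterMark) // => 65536
```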

256k:

name, ops/second, mb/second, ops, total-mb, seconds
append, 206282.517, 26.095, 2064888, 261.211, 10.01
stream, 97902.809, 12.384, 979126, 123.861, 10.001
stream no cache, 112938.906, 14.286, 1129502, 142.883, 10.001
stream10, 534515.548, 67.617, 5345690, 676.244, 10.001
random, 39180.627, 4.956, 391963, 49.583, 10.004

64k:

name, ops/second, mb/second, ops, total-mb, seconds
append, 207204.597, 26.22, 2073082, 262.331, 10.005
stream, 97322.267, 12.315, 973320, 123.166, 10.001
stream no cache, 106819.318, 13.517, 1068300, 135.184, 10.001
stream10, 506347.865, 64.073, 5063985, 640.8, 10.001
random, 37810.318, 4.784, 378141, 47.85, 10.001

16k:

name, ops/second, mb/second, ops, total-mb, seconds
append, 172345.823, 21.806, 1724837, 218.236, 10.008
stream, 90706.887, 11.476, 907341, 114.8, 10.003
stream no cache, 99001.899, 12.526, 990217, 125.286, 10.002
stream10, 516003.199, 65.287, 5160548, 652.939, 10.001
random, 37758.824, 4.777, 377626, 47.779, 10.001

It doesn't seem to make much difference on my system.

@arj03 (Contributor, Author) commented Jun 6, 2018

Wow, that is odd. What kind of disk do you have?

I ran the benchmarks on two different machines with comparable results. Here are the results from my "fast" machine:

16kb:

name, ops/second, mb/second, ops, total-mb, seconds
append, 269938.212, 33.883, 2699922, 338.899, 10.002
stream, 184220.777, 23.123, 1842392, 231.26, 10.001
stream no cache, 205688.531, 25.818, 2057091, 258.209, 10.001
stream10, 666496.15, 83.659, 6665628, 836.68, 10.001
random, 56624.812, 7.107, 566418, 71.097, 10.003

name, ops/second, mb/second, ops, total-mb, seconds
append, 244853.687, 30.743, 2450006, 307.615, 10.006
stream, 194820.917, 24.461, 1948404, 244.636, 10.001
stream no cache, 207688.431, 26.076, 2077092, 260.794, 10.001
stream10, 656314.068, 82.405, 6563797, 824.139, 10.001
random, 41092.405, 5.159, 411787, 51.702, 10.021

256kb:

name, ops/second, mb/second, ops, total-mb, seconds
append, 254245.727, 31.927, 2544237, 319.497, 10.007
stream, 218160.683, 27.395, 2181825, 273.986, 10.001
stream no cache, 239316.768, 30.052, 2393407, 300.556, 10.001
stream10, 217253.149, 27.282, 2172966, 272.877, 10.002
random, 51636.736, 6.484, 516419, 64.849, 10.001

name, ops/second, mb/second, ops, total-mb, seconds
append, 253162.883, 31.787, 2531882, 317.904, 10.001
stream, 216889.122, 27.232, 2169325, 272.382, 10.002
stream no cache, 224141.085, 28.143, 2241635, 281.461, 10.001
stream10, 190692.153, 23.943, 1907875, 239.551, 10.005
random, 67537.846, 8.48, 675446, 84.809, 10.001

@arj03 (Contributor, Author) commented Jun 6, 2018

I should have written more clearly why I think block size is important. In testing I found that block size made a noticeable difference, especially when combined with a fast path in aligned-block-file where we optimize for the case that the slice we need is probably already in memory.

The benchmarks really depend a lot on what kind of data you feed them, and messages in ssb can be quite large, so it appears it would make sense to try bumping the buffer size. While this is not such a big problem on an SSD, it's still an I/O read, which can be quite expensive compared to just reading from memory.

The benchmarks above clearly demonstrate how important it is to test things on many different machines. I wonder how we can make that more accessible to people.
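
To make the fast path concrete, a minimal sketch of the idea (this is not aligned-block-file's actual code: it assumes a single cached block, ignores reads that span block boundaries, and `readBlockFromDisk` is a hypothetical helper):

```js
// Sketch of the "probably in memory" fast path described above.
function createReader (blockSize, readBlockFromDisk) {
  var cachedIndex = -1
  var cachedBlock = null

  return function read (offset, length, cb) {
    var index = Math.floor(offset / blockSize)
    var start = offset % blockSize
    if (index === cachedIndex) {
      // Fast path: the block is already in memory, no I/O needed.
      return cb(null, cachedBlock.slice(start, start + length))
    }
    // Slow path: hit the disk, then cache the block for the next read.
    readBlockFromDisk(index, function (err, block) {
      if (err) return cb(err)
      cachedIndex = index
      cachedBlock = block
      cb(null, block.slice(start, start + length))
    })
  }
}
```

With larger blocks, more consecutive reads land in the cached block, which is why block size and this fast path interact.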

@arj03 (Contributor, Author) commented Jun 6, 2018

Here are the results from my "benchmark" machine:

256kb:

name, ops/second, mb/second, ops, total-mb, seconds
append, 57846.161, 7.318, 578693, 73.217, 10.004
stream, 36769.646, 4.652, 367770, 46.531, 10.002
stream no cache, 42792.345, 5.414, 428223, 54.18, 10.007
stream10, 37258.322, 4.714, 372695, 47.154, 10.003
random, 35687.912, 4.515, 357236, 45.198, 10.01

name, ops/second, mb/second, ops, total-mb, seconds
append, 56924.33, 7.199, 569471, 72.026, 10.004
stream, 40642.335, 5.14, 406464, 51.409, 10.001
stream no cache, 47514.697, 6.009, 475242, 60.108, 10.002
stream10, 41018.892, 5.188, 410353, 51.9, 10.004
random, 34174.395, 4.322, 341949, 43.249, 10.006

16kb:

name, ops/second, mb/second, ops, total-mb, seconds
append, 60984.315, 7.716, 610453, 77.238, 10.01
stream, 35011.791, 4.429, 350363, 44.33, 10.007
stream no cache, 42308.469, 5.353, 423127, 53.536, 10.001
stream10, 220709.329, 27.925, 2207314, 279.285, 10.001
random, 20472.743, 2.59, 205055, 25.946, 10.016

name, ops/second, mb/second, ops, total-mb, seconds
append, 54037.17, 6.836, 540804, 68.418, 10.008
stream, 29821.453, 3.772, 298304, 37.739, 10.003
stream no cache, 34741.151, 4.395, 347481, 43.961, 10.002
stream10, 210737.352, 26.66, 2107795, 266.661, 10.002
random, 20467.253, 2.589, 204693, 25.896, 10.001

and 64kb for completeness:

name, ops/second, mb/second, ops, total-mb, seconds
append, 54037.17, 6.836, 540804, 68.418, 10.008
stream, 29821.453, 3.772, 298304, 37.739, 10.003
stream no cache, 34741.151, 4.395, 347481, 43.961, 10.002
stream10, 210737.352, 26.66, 2107795, 266.661, 10.002
random, 20467.253, 2.589, 204693, 25.896, 10.001

@dominictarr (Collaborator)

Interesting! It's just an X201 ThinkPad with a rotating disk... Do you think this is due to OS caching policies? Maybe we could have a thing that runs and profiles the system, then chooses the settings that are most performant?
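
A sketch of what such a calibration step could look like, where `runBenchmark` is an assumed helper that returns ops/second for a given block size (nothing like this exists in the module yet):

```js
// Hypothetical calibration routine: benchmark a few candidate block
// sizes on the current machine and keep the fastest one.
function pickBlockSize (runBenchmark) {
  var candidates = [16 * 1024, 64 * 1024, 256 * 1024]
  var best = { size: candidates[0], ops: 0 }
  candidates.forEach(function (size) {
    var ops = runBenchmark(size) // ops/second at this block size
    if (ops > best.ops) best = { size: size, ops: ops }
  })
  return best.size
}
```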

@dominictarr (Collaborator)

Or at least, have a thing to run and detect what sort of profile your computer has. Hmm, on your machines 256k buffers are significantly faster on random reads, but 16k vs 64k makes no difference. I'm guessing the relative size of objects is also a big factor here. ssb messages have a pretty wide range of sizes: last I checked, the average message size is 0.7k, but the max is 8k, of course (so a 256k block holds a few hundred average-sized messages).

@arj03 (Contributor, Author) commented Jun 8, 2018

Ahh, the old mechanical drive :) I replaced mine long ago; best performance upgrade I have ever done. I guess you are in the minority here ;-)

Yeah, 256k is really a lot faster on my slow machine. And stream is still faster on my fast machine, but not by as much. I'm just worried about the impact on stream10. Do you think it is a realistic case for ssb to hit that particular behaviour? And how often, compared to stream? My guess is that stream and random are much more common, but I really don't know. As I said earlier, the reason I opened this is that it had real-world improvements in the various benchmarks we have.

@christianbundy (Member)

This still seems like it would be nice to document. What do y'all think?

@dominictarr (Collaborator)

@christianbundy this is a good example of something that can just be merged

arj03 closed this Feb 20, 2020