Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected performance degradation when publishing with publisher confirms #73

Open
vitaly-krugl opened this issue May 14, 2015 · 6 comments

Comments

@vitaly-krugl
Copy link

I ran several publish tests on my recent model development MacBookPro using the test suite from https://github.com/vitaly-krugl/amqp-perf and discovered a significant unexpected performance degradation when publishing with Publisher Confirms enabled.

RabbitMQ broker was at version 3.5.1, running locally with default configuration.
rabbitpy was version 0.26.2; also got the same results with rabbitpy master 75d6b75

Two types of runs of the publish test were conducted. They differed only in whether Publisher Confirms were enabled. The other parameters were fixed at message size of 1024 bytes and 10,000 messages.

The core logic of the test opens connection, opens channel, enables publisher confirms (or not), and enters the publishing loop where it publishes all the messages one after another using default properties (None).

rabbitpy without publisher confirms

time python rabbitpy_perf.py publish --impl=Channel --exg="" --size 1024 --msgs=10000
real    0m2.456s
user    0m1.693s
sys 0m1.095s

rabbitpy with publisher confirms

time python rabbitpy_perf.py publish --impl=Channel --exg="" --size 1024 --msgs=10000 --delivery-confirmation
real    0m15.346s
user    0m5.039s
sys 0m1.220s

Note that with Publisher Confirms enabled, real time (mostly sleeping) went up a lot: over 6x. User time (mostly spinning) went up a lot, too: over 3x

In addition, compare these results with results from comparable tests using pika.SelectConnection from pika master pika/pika@dc9db2b, and note the stark performance difference:

pika.SelectConnection without publisher confirms

time python pika_perf.py publish --impl=SelectConnection --exg="" --size 1024 --msgs=10000
real    0m1.220s
user    0m0.884s
sys 0m0.331s

pika.SelectConnection with publisher confirms

time python pika_perf.py publish --impl=SelectConnection --exg="" --size 1024 --msgs=10000 --delivery-confirmation
real    0m1.716s
user    0m1.365s
sys 0m0.346s

Note that with pika.SelectConnection, both the absolute times are much shorter than with rabbitpy and also the difference between the no-confirms and publisher confirms is much smaller than in the rabbitpy case.

Finally, below are the results from pika.SynchronousConnection (https://github.com/vitaly-krugl/pika/tree/pika-synchronous) that is a pika.BlockingConnection-like interface built on top of pika.SelectConnection. Just as with rabbitpy, pika.SynchronousConnection's basic_publish() waits for Basic.Ack before returning.

pika.SynchronousConnection without publisher confirms

time python pika_perf.py publish --impl=SynchronousConnection --exg="" --size 1024 --msgs=10000
real    0m1.622s
user    0m1.239s
sys 0m0.364s

pika.SynchronousConnection with publisher confirms

time python pika_perf.py publish --impl=SynchronousConnection --exg="" --size 1024 --msgs=10000  --delivery-confirmation
real    0m2.918s
user    0m2.011s
sys 0m0.470s
@gmr
Copy link
Owner

gmr commented May 18, 2015

Does your sync code wait to publish the next message until the confirm is received?

Edit: Hmm I see that you say it does, I'll try to recreate and confirm the same behavior on my end.

@gmr
Copy link
Owner

gmr commented May 19, 2015

Confirmed behavior in 0.26, investigating

@gmr
Copy link
Owner

gmr commented May 19, 2015

After profiling, much of the time is spent in time.sleep() in the Queue.get() with a timeout. This is a trade-off issue.

Lowering the timeout too low and people complain, raise the timeout and people complain. I added a blocking_read flag when constructing a channel around 0.20 that increases performance.

In my tests, enabling it dropped the publish time from 14s on 10k messages to 5s on 10k messages.

Of the 14s, it looked like 12s of the profile was in time.sleep.

@gmr
Copy link
Owner

gmr commented May 19, 2015

With blocking_read enabled on the channel, 3.6 seconds is attributed to thread lock acquisition

@gmr
Copy link
Owner

gmr commented May 19, 2015

Refactoring a bit to try and use collections.deque to see if there's a performance bump to be found there

@gmr
Copy link
Owner

gmr commented May 19, 2015

Well that was a bit of a failure, much worse performance there. For now I'm going to have I don't see any real gains to be made in the current blocking code structure. I do have some ideas of how I could make this better but it'll be a significant refactor I can't address at the moment.

I'd suggest passing in True when creating the channel and see where that gets you with your test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants