XAPI throttling proof of concept #6778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/throttling
Conversation
The token bucket library implements the token bucket algorithm, to be used for rate-limiting. This commit implements basic token buckets, which contain tokens that are refilled over time according to their refill parameter, up to a maximum determined by the burst parameter. Tokens can be consumed in a thread-safe way - consuming returns false when there are not enough tokens available, and true when the operation was successful. Signed-off-by: Christian Pardillo Laursen <[email protected]>
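A minimal sketch of the scheme this commit describes, under assumed names (create, consume) and a float-seconds clock rather than the PR's actual API. The real implementation guards consumption with a mutex for thread safety; that is omitted here for brevity.

```ocaml
type t = {
    burst_size: float           (* maximum tokens the bucket can hold *)
  ; fill_rate: float            (* tokens added per second *)
  ; mutable tokens: float       (* tokens currently available *)
  ; mutable last_refill: float  (* time of the last refill, in seconds *)
}

let create ~burst_size ~fill_rate now =
  {burst_size; fill_rate; tokens= burst_size; last_refill= now}

(* Top up the bucket according to the time elapsed, capped at burst_size. *)
let refill now tb =
  let elapsed = now -. tb.last_refill in
  tb.tokens <- Float.min tb.burst_size (tb.tokens +. (elapsed *. tb.fill_rate)) ;
  tb.last_refill <- now

(* Deduct [amount] tokens and return true if enough were available,
   otherwise leave the token count unchanged and return false. *)
let consume ~now amount tb =
  refill now tb ;
  if tb.tokens >= amount then (
    tb.tokens <- tb.tokens -. amount ;
    true
  ) else
    false
```

A client that exhausts its burst is refused until enough time has passed for the bucket to refill.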
Bucket tables map client identifiers to their token buckets, and are the main data structure for rate limiting. Signed-off-by: Christian Pardillo Laursen <[email protected]>
To be replaced with a proper datamodel. Bucket tables are used for mapping requests to their respective token bucket so that they can be rate limited. Signed-off-by: Christian Pardillo Laursen <[email protected]>
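The lookup described above can be sketched as follows; the function name and the use of a plain Hashtbl keyed by a string client identifier are assumptions for illustration, with the bucket type left abstract.

```ocaml
(* Map a client identifier to its rate-limiting state, creating the state on
   first sight via [make]. *)
let find_or_create (tbl : (string, 'a) Hashtbl.t) client ~(make : unit -> 'a) =
  match Hashtbl.find_opt tbl client with
  | Some bucket ->
      bucket
  | None ->
      let bucket = make () in
      (* replace rather than add: avoids stacking duplicate bindings
         for the same client *)
      Hashtbl.replace tbl client bucket ;
      bucket
```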
It's a bit tough to understand how the data structure does rate-limiting. I had to find documentation online to understand it. As such, I would appreciate a more in-depth explanation in the header of the .mli file. For example, it would be good to explain when the "refill" timestamp changes, and to note that some of the methods are only meant to be used for testing.
    (** Create token bucket with given parameters and supplied initial timestamp
        @param timestamp Initial timestamp
        @param burst_size Maximum number of tokens that can fit in the bucket
        @param fill_rate Number of tokens added to the bucket per second
fill_rate has an implicit unit (Hz); I think it would be worth representing the fill rate as the amount of time it takes the bucket to go from empty to full. This is a timespan, and we can use a known datatype with an explicit unit here: Mtime.span.
How some operations would change:

    let peek_with_delta time_delta tb =
      let fill_time = Mtime.Span.to_float_ns tb.fill_time in
      let time_delta = Mtime.Span.to_float_ns time_delta in
      min tb.burst_size (tb.tokens +. (time_delta /. fill_time))

    let delay_until_available_with_delta delta tb amount =
      let current_tokens = peek_with_delta delta tb in
      let required_tokens = max 0. (amount -. current_tokens) in
      required_tokens *. Mtime.Span.to_float_ns tb.fill_time
      |> Float.to_int64
      |> Mtime.Span.of_uint64_ns
I can see the appeal of adding units to the fill_rate but I'd rather keep the burst size decoupled from fill rate, and I think it's more intuitive as tokens per second than seconds per token.
How can this fail, i.e., return None?
I think it can only fail in some extreme cases, if you have an arithmetic overflow, although that would require an Mtime span of roughly 584 years.
      burst_size: float
    ; fill_rate: float
    ; mutable tokens: float
    ; mutable last_refill: Mtime.span
Is there any reason this cannot be a counter? They return the time difference directly: https://ocaml.org/p/mtime/latest/doc/mtime.clock/Mtime_clock/index.html#counters
This would mean that the peek functions would turn into:
    let peek_with_delta time_delta tb =
      let time_delta_seconds = Mtime.Span.to_float_ns time_delta *. 1e-9 in
      min tb.burst_size (tb.tokens +. (time_delta_seconds *. tb.fill_rate))

    let peek tb = peek_with_delta (Mtime_clock.count tb.last_refill) tb
Keeping a counter does simplify things a bit, but if I understand correctly it forces us to make two system-time calls when consuming tokens: one to obtain the difference from the counter, and another to produce a new counter. It probably won't make much of a difference; I can try profiling.
Signed-off-by: Christian Pardillo Laursen <[email protected]>
Added some documentation to the token bucket module to explain the rate limit application.
Zero or negative rate limits can cause issues in the behaviour of rate limiting. In particular, zero fill rate leads to a division by zero in time calculations. Rather than account for this, we forbid the creation of token buckets with a bad fill rate by returning None. Signed-off-by: Christian Pardillo Laursen <[email protected]>
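The validated construction described in this commit can be sketched as below; the record fields and the rejection of non-positive burst sizes alongside non-positive fill rates are illustrative assumptions, not necessarily the PR's exact checks.

```ocaml
type t = {burst_size: float; fill_rate: float}

(* A zero or negative fill rate would divide by zero (or count backwards) in
   the delay computation, so creation refuses such parameters with None. *)
let create ~burst_size ~fill_rate =
  if fill_rate <= 0. || burst_size <= 0. then
    None
  else
    Some {burst_size; fill_rate}
```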
Make token bucket type abstract to hide Hashtbl.t Use `replace` rather than `add` for adding a new bucket Signed-off-by: Christian Pardillo Laursen <[email protected]>
The current implementation of rate limiting had severe fairness issues. These have been resolved through the addition of a request queue, to which rate limited requests are added. A worker thread sleeps until its associated token bucket has enough tokens to handle the request at the head of the queue, calls it, and sleeps until the next request is ready. Signed-off-by: Christian Pardillo Laursen <[email protected]>
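The worker-thread scheme this commit describes might look roughly like the sketch below. The helper names (sleep, delay_until_available, consume) and the single-shot loop are assumptions for illustration; the PR's worker presumably blocks indefinitely on an empty queue rather than exiting.

```ocaml
(* Drain a FIFO of (cost, handler) requests in strict arrival order: the
   worker sleeps until the token bucket can cover the head request, consumes
   its cost, then runs it. FIFO order is what restores fairness. *)
let worker_loop ~(sleep : float -> unit)
    ~(delay_until_available : float -> float) ~(consume : float -> unit)
    (queue : (float * (unit -> unit)) Queue.t) =
  while not (Queue.is_empty queue) do
    let cost, handle = Queue.pop queue in
    let delay = delay_until_available cost in
    if delay > 0. then sleep delay ;
    consume cost ;
    handle ()
  done
```

Because only the head of the queue is ever served, a burst from one client can no longer starve requests that arrived earlier.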
     * GNU Lesser General Public License for more details.
     *)

    type rate_limit_data = {
Could use a simpler name like bucket, consumer or similar.
Creating a token bucket fails if the rate limit supplied is 0 or negative - this can lead to unexpected and undesirable behaviour, such as division by 0 or negative token counts. Signed-off-by: Christian Pardillo Laursen <[email protected]>
        "An identifier for the rate limited client" ~ignore_foreign_key:true
        ~default_value:(Some (VString ""))
    ; field ~qualifier:StaticRO ~ty:Float ~lifecycle "burst_size"
        "Amount of tokens that can be consumed in one burst"
I don't think the idl should mention tokens or buckets at all; instead I would try to communicate the meaning of the parameters in a way that allows users to form a mental model of how rate limiting works:
    - "Amount of tokens that can be consumed in one burst"
    + "Amount of RPC calls that the client can do in burst"
I agree, we shouldn't talk about token buckets and I'll change that. The plan is to assign higher token costs to more expensive calls, e.g. VM create, so we can't simplify to the level of RPC calls, but I'll figure out how to document this for users.
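The per-call cost idea mentioned in this reply could be sketched as a weight function; the call names and cost values below are purely illustrative assumptions, not anything the PR defines.

```ocaml
(* Heavier calls drain the bucket faster, so an expensive operation like
   VM.create counts as several "ordinary" calls against the rate limit. *)
let cost_of_call = function
  | "VM.create" | "VM.clone" -> 10. (* expensive: large hypothetical weight *)
  | "VM.get_record" -> 1.           (* cheap read *)
  | _ -> 1.                         (* default cost per call *)
```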
        "Amount of tokens that can be consumed in one burst"
        ~ignore_foreign_key:true ~default_value:(Some (VFloat 0.))
    ; field ~qualifier:StaticRO ~ty:Float ~lifecycle "fill_rate"
        "Tokens added to token bucket per second" ~ignore_foreign_key:true
    - "Tokens added to token bucket per second" ~ignore_foreign_key:true
    + "Calls per second afforded to the client" ~ignore_foreign_key:true
Rate limits can no longer be set from xapi_globs. Instead, the rate limiter is now initialised from the database on startup. Signed-off-by: Christian Pardillo Laursen <[email protected]>
    type rate_limit_data = {
        bucket: Token_bucket.t
      ; process_queue:
Would it be possible to let the caller know how long it should delay before moving ahead?
For example:
consume t amount |> Option.iter Thread.delay ;
handle ()
The schedules of different threads can be partially sorted by the delays returned to them.
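One way to realise the suggested signature is to let the bucket go into token debt, so consume always succeeds but reports how long the caller should sleep. This is a sketch under that assumption; the field names and the debt-based design are illustrative, not the reviewer's mandated implementation.

```ocaml
type t = {fill_rate: float; mutable tokens: float}

(* Returns None when the tokens were available immediately, or Some delay
   (in seconds) the caller should wait before proceeding. *)
let consume tb amount =
  let delay =
    if tb.tokens >= amount then None
    else Some ((amount -. tb.tokens) /. tb.fill_rate)
  in
  (* going negative records the debt; it is repaid as the bucket refills *)
  tb.tokens <- tb.tokens -. amount ;
  delay
```

Callers then use exactly the pattern from the comment: `consume t amount |> Option.iter Thread.delay`.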
Proof of concept for XAPI throttling.
This allows users to specify, in xapi.conf, the user agents of rate-limited clients. Requests from those clients consume from a token bucket, and must wait for it to refill once they exceed its capacity.