|
|
|
|
|
by jpollock
1756 days ago
|
|
If we change the code, getting rid of if statements, it's easier to see what happens by removing if's. I've changed BucketReduction to be a pair, and fixed the strange behavior on error. Swapping "fromBurst" and "auto_replenish" would reduce the number of throttling events. This solution does have more
branches than the original, hence readability vs performance. It's still not exception safe, auto_replenish could throw and leave the state undefined. BucketReduction reduce(long tokens) {
long fromBurst = min(tokens, burst);
burst -= fromBurst;
tokens -= fromBurst;
if (tokens > budget) {
auto_replenish();
}
long fromBudget = min(tokens, budget);
budget -= fromBudget;
tokens -= fromBudget;
long totalConsumed = fromBurst + fromBudget;
if (tokens == 0) {
return new BucketReduction(Success, totalConsumed);
}
if (totalConsumed + tokens > size) {
return new BucketReduction(OverConsumption, totalConsumed);
}
return new BucketReduction(Failure, totalConsumed);
}
|
|
I don't find the comments in the original code distracting, but I do like your version better.
> I'm also curious why burst is consumed, then budget. I would expect _budget_ to be consumed first (with refill) with overflow into burst? My expectation is for burst and budget to have different refill schedules in auto_replenish, so using burst first would result in more failures by missing refill opportunities.
This behavior is documented in the public API [0], so whatever is the reason why it was chosen, I don't think it can ever be changed.
> I don't understand why OverConsumption is different to Failure. Both will result in throttling by the caller. The reason for the difference should be documented.
My understanding is this. If the number of tokens requested is greater than the remaining budget but less than the size of the bucket, the call is rejected and the caller is blocked until it has enough tokens. But if the number of requested tokens is greater than the size of the bucket, the caller will never have enough tokens. Instead of blocking the caller forever, the rate limiter lets the call go through, but then blocks the caller for a while to compensate for the over-consumption. Here's the handling [1]. I wish it was documented better.
[0] https://github.com/firecracker-microvm/firecracker/blob/fc2e...
[1] https://github.com/firecracker-microvm/firecracker/blob/2f92...