by jedberg
2453 days ago
> Load-sensitivity is one “smart” approach. The idea is that you keep track of the load on each shard, and selectively route traffic to the lightly-loaded ones and away from the busy ones. Simplest thing is, if you have some sort of load metric, always pick the shard with the lowest value.
Gotta be super careful with this one. We did this at reddit and it bit us bad. The problem was that as soon as the load on a machine went down, it got pounded with new requests and the load shot up, but it takes a few seconds for the load number to react to all the new requests. So we saw a really bad see-saw effect. We had to add extra logic to track how long a machine had been at a certain load, and also randomly send requests to slightly more loaded machines to keep things even.
The moral of the story here is make sure you pick a metric that reacts to the change in request rate as quickly as your request rate changes!
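For anyone who wants to poke at the see-saw, here is a rough toy simulation in Python. It is only a sketch under made-up assumptions, not what reddit actually ran: the `simulate` function, its parameters (`servers`, `arrivals`, `capacity`, `lag`), and the tick-based queue model are all hypothetical. The "two_choices" policy (pick two servers at random, send to the one with the lower reported load) is a stand-in for the "randomly send requests to slightly more loaded machines" fix, not the same logic.

```python
import random
from collections import deque


def simulate(policy, servers=4, ticks=200, arrivals=8, capacity=3, lag=3, seed=0):
    """Return the peak queue depth seen on any server (toy model).

    Each tick, `arrivals` requests come in and every server then finishes up
    to `capacity` queued requests. The balancer only sees queue depths that
    are `lag` ticks stale, standing in for a slow-moving load metric.
    """
    rng = random.Random(seed)
    queues = [0] * servers
    # Rolling window of snapshots; history[0] is the stale view the balancer uses.
    history = deque([list(queues) for _ in range(lag + 1)], maxlen=lag + 1)
    peak = 0
    for _tick in range(ticks):
        reported = history[0]
        for _request in range(arrivals):
            if policy == "lowest":
                # Always pick the server with the lowest *reported* load.
                target = min(range(servers), key=lambda i: reported[i])
            elif policy == "two_choices":
                # Sample two servers, take the one that looks less loaded.
                a, b = rng.sample(range(servers), 2)
                target = a if reported[a] <= reported[b] else b
            else:
                raise ValueError(f"unknown policy: {policy}")
            queues[target] += 1
        peak = max(peak, max(queues))
        # Service step, then record the snapshot the balancer will see later.
        queues = [max(0, q - capacity) for q in queues]
        history.append(list(queues))
    return peak


if __name__ == "__main__":
    for policy in ("lowest", "two_choices"):
        print(policy, simulate(policy))
```

With the "lowest" policy, every request in a tick lands on whichever machine looked idle a few ticks ago, so one queue balloons before the metric catches up and the herd then moves on to the next machine. Setting `lag=0` or spreading picks randomly should dampen that in this model, though the numbers only mean something relative to each other.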
I'm curious how you reached this condition as a requirement:
> The moral of the story here is make sure you pick a metric that reacts to the change in request rate as quickly as your request rate changes!
It makes sense intuitively, but I'm having trouble proving to myself that this is both necessary and sufficient.
[1]: https://en.wikipedia.org/wiki/PID_controller