by zzzcpan 2986 days ago
You are still bounded by at most 2x the number of requests, so no, this cannot take down the system; at worst it slows it down a bit. In practice even that is unlikely, since you always need enough capacity for more than 2x load anyway.

However, in my experience latencies are not static: they depend on how far away the request is sent, the type and size of the resource requested, the current network load in that direction, and other factors. This gets complicated quickly. At some point you need to store recent latency history for requests per node, per resource type, and per size group, and dynamically calculate the 90th percentile latency from it. But then things like size may not be predictable, so you may need to cap response sizes at a sufficiently small value. And so on.
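
For what it's worth, here is a minimal sketch of that bookkeeping in Python. It assumes a sliding window of recent samples per (node, resource type, size group) and a nearest-rank percentile. The class name `LatencyTracker`, the window size, and the key layout are made up for illustration and do not come from any real library.

```python
import math
from collections import defaultdict, deque


class LatencyTracker:
    """Hypothetical helper: recent latencies per (node, resource_type, size_group)."""

    def __init__(self, window=1000):
        # Each key keeps only its `window` most recent samples;
        # older ones fall off the left end of the deque.
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, node, resource_type, size_group, latency):
        self.history[(node, resource_type, size_group)].append(latency)

    def percentile(self, node, resource_type, size_group, p=90, default=None):
        """Nearest-rank p-th percentile (p in 1..100) of the stored samples.

        Returns `default` when nothing has been recorded for this key yet.
        """
        samples = self.history.get((node, resource_type, size_group))
        if not samples:
            return default
        ordered = sorted(samples)
        # Nearest rank: the smallest sample such that at least p% of
        # all samples are less than or equal to it.
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]
```

Assuming the usual setup, the p90 for a key would then be the delay you wait before firing the second request for that key. Sorting on every lookup is fine for a sketch, but a real system would probably want a streaming estimate instead.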

If your responses are small, it's easier to just always send two requests in parallel to different servers and use whichever response comes back first.
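
A rough sketch of the "always send two" variant, using only the Python standard library. `fastest_of` and `fake_fetch` are hypothetical names; `fake_fetch` just sleeps to stand in for a real network call. It assumes at least one server is given and Python 3.9+ (for `cancel_futures`).

```python
import concurrent.futures as cf
import time


def fastest_of(fetch, servers, timeout=None):
    """Call fetch(server) for every server in parallel.

    Returns (server, result) for the first call that succeeds. If every
    call fails, re-raises the last exception seen.
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(servers))
    futures = {pool.submit(fetch, server): server for server in servers}
    try:
        last_error = None
        # as_completed yields futures in the order they finish.
        for future in cf.as_completed(futures, timeout=timeout):
            try:
                return futures[future], future.result()
            except Exception as exc:
                # A fast failure should not beat a slower success.
                last_error = exc
        raise last_error
    finally:
        # Do not wait for the slower call. Calls that have not started
        # are cancelled; one already running finishes in the background
        # and its result is discarded.
        pool.shutdown(wait=False, cancel_futures=True)


def fake_fetch(server):
    """Stand-in for a real request: each fake server has a fixed delay."""
    delays = {"server-a": 0.30, "server-b": 0.01}
    time.sleep(delays[server])
    return "response from " + server
```

With these stand-ins, `fastest_of(fake_fetch, ["server-a", "server-b"])` returns the `server-b` response. Note that the slower request still runs to completion on the server side, which is where the 2x load comes from.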