by emfree 3482 days ago
> But in web services you often care more about the tail-end latency, the p90, p99 etc.

For sure. I think Theorem 2 in the paper implicitly addresses the latency distribution in this scheme. They're saying that, in the limit of a large system, the queue length distribution at a single backend server depends only on the service time distribution (how long it takes to actually process each job) and the service discipline. So if, for example, job sizes are exponentially distributed and handled in FIFO order, then the wait time distribution is also exponential.
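
To make the exponential case concrete, here is a rough sketch that simulates a single FIFO server and compares empirical tail percentiles against an exponential distribution. It is not the paper's scheme: it assumes plain Poisson arrivals at one server (a textbook M/M/1 queue), where the response time (wait plus service) is exponential with rate mu - lambda. The function names and the rates 0.8 and 1.0 are made up for illustration.

```python
import math
import random


def simulate_fifo_response_times(arrival_rate, service_rate, n_jobs, seed=0):
    """Simulate one FIFO server and return each job's response time.

    Assumes Poisson arrivals and exponential job sizes (an M/M/1 queue).
    """
    rng = random.Random(seed)
    wait = 0.0  # time the current job spends queued before service starts
    responses = []
    for _ in range(n_jobs):
        service = rng.expovariate(service_rate)
        responses.append(wait + service)
        gap = rng.expovariate(arrival_rate)  # time until the next arrival
        # Lindley recursion: the next job waits for whatever work is left
        # when it arrives, or not at all if the server has gone idle.
        wait = max(0.0, wait + service - gap)
    return responses


def percentile(sorted_values, p):
    """Simple rank-based percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


def exponential_quantile(rate, p):
    """Inverse CDF of an exponential distribution with the given rate."""
    return -math.log(1.0 - p) / rate


if __name__ == "__main__":
    lam, mu = 0.8, 1.0  # hypothetical arrival and service rates (80% load)
    times = sorted(simulate_fifo_response_times(lam, mu, 200_000))
    for p in (0.5, 0.9, 0.99):
        simulated = percentile(times, p)
        predicted = exponential_quantile(mu - lam, p)
        print(f"p{round(p * 100)}: simulated={simulated:.2f} exponential={predicted:.2f}")
```

If the exponential claim holds in this simplified setting, the simulated p90 and p99 should land close to the predicted values, with some sampling noise in the far tail.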

It would certainly be nice to see a more explicit discussion of the tail latency, especially in the simulations the authors did.