
I am familiar with Little's Law. The key concept is optimistic sequential execution. The overwhelming majority of requests do not need to page in cold data or otherwise stall for long. Also, in the case where they can stall, e.g. to process a lot of data in a loop, they can yield back to the scheduler, thereby being fair to all other accepted coroutines/requests and giving others a chance to complete earlier. In practice, at least based on my prototype and tests, this works extremely well. If only about 10-15% of the requests end up blocking the thread, while the rest can read kernel-cache-resident data and proceed, that's really huge in terms of performance gains. The background threads that become owners of, and responsible for running, slow/blocking coros can rely on a work-stealing scheme to keep them all busy. If you don't rely on them to take contention and stalls off the 'fast' threads, you are going to eventually suffer stalls and latency spikes.
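
To make the idea concrete, here is a minimal Python sketch of the scheme described above. It is not the prototype's code. All names (`request`, `drain`, `run_fast_thread`, the `YIELD` / `WOULD_BLOCK` signals) are hypothetical, the `cold` flag stands in for a real "is this data resident in the kernel cache?" check, and `time.sleep` stands in for a blocking read. Note also that `ThreadPoolExecutor` uses a shared work queue, not work stealing; it is only a stand-in for the background threads.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
import time

YIELD = "yield"              # cooperative yield: requeue on the fast thread
WOULD_BLOCK = "would_block"  # next step may stall: migrate to a slow thread


def request(req_id, steps, cold=False):
    """Hypothetical request handler written as a generator-based coroutine."""
    if cold:
        yield WOULD_BLOCK    # signal *before* touching non-resident data
        time.sleep(0.01)     # stand-in for a blocking read (assumption)
    for _ in range(steps):
        yield YIELD          # long loop: give other requests a turn
    return req_id


def drain(coro):
    """Run a migrated coroutine to completion on a background thread."""
    try:
        while True:
            next(coro)       # blocking here is fine; this is a slow thread
    except StopIteration as done:
        return done.value


def run_fast_thread(coros, slow_pool):
    """Round-robin scheduler for the 'fast' thread; it never blocks itself."""
    ready = deque(coros)
    completed, migrated = [], []
    while ready:
        coro = ready.popleft()
        try:
            signal = next(coro)          # optimistic: assume it won't stall
        except StopIteration as done:
            completed.append(done.value)
            continue
        if signal == WOULD_BLOCK:
            # Hand ownership of the coroutine to the background pool.
            migrated.append(slow_pool.submit(drain, coro))
        else:
            ready.append(coro)           # fairness: back of the queue
    return completed, migrated


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        reqs = [request(i, steps=3, cold=(i % 8 == 0)) for i in range(16)]
        fast, slow = run_fast_thread(reqs, pool)
        print("fast path:", sorted(fast))
        print("slow path:", sorted(f.result() for f in slow))
```

The point of the sketch is the ownership transfer: once a coroutine signals that it may block, the fast thread never touches it again, so a slow read cannot delay the requests queued behind it.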