by anonymoushn 1778 days ago
Do you know if Varnish's request coalescing allows it to send partial responses to every client? For example, if an origin server sends headers immediately and then takes 10 minutes to send the response body at a constant rate, will every client have half of the response body after 5 minutes?

Thanks!

3 comments

I don’t know about Varnish, but having worked on other implementations, you would usually have a timeout on the initial lock (semaphore) to prevent a slow connection from impacting all clients.
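As a rough illustration of the timeout on the initial lock, here is a minimal sketch in Python. It is not taken from Varnish or any other real cache: the `Coalescer` class, the `fetch` callable, and the `lock_timeout` parameter are all hypothetical names. The first request for a key becomes the leader and goes to the origin; later requests wait for it for at most `lock_timeout` seconds and then give up and hit the origin themselves.

```python
import threading


class Coalescer:
    """Hypothetical single-flight helper: one leader fetches, followers wait."""

    def __init__(self, fetch, lock_timeout):
        self._fetch = fetch                # callable(key) -> bytes, hits the origin
        self._lock_timeout = lock_timeout  # seconds a follower is willing to wait
        self._mutex = threading.Lock()
        self._inflight = {}                # key -> (done event, result box)

    def get(self, key):
        with self._mutex:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
        done, box = entry

        if leader:
            try:
                box["body"] = self._fetch(key)
            finally:
                # Unregister and wake followers even if the fetch failed.
                with self._mutex:
                    del self._inflight[key]
                done.set()
            return box["body"]

        # Follower: wait a bounded time for the leader's result.
        if done.wait(self._lock_timeout) and "body" in box:
            return box["body"]
        # Timed out (or the leader failed): go to the origin directly.
        return self._fetch(key)
```

With a short `lock_timeout`, a slow origin response costs followers only that wait, at the price of extra origin requests once it expires.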

But this is much, much harder to do once you are already streaming the response: if the time to first byte (TTFB) is quick but the connection is low-throughput, you can’t do much at that point. Nearly all modern implementations stream the bytes to all clients immediately; they don’t wait to fill the cache first, they do both simultaneously.
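The "stream and fill the cache simultaneously" behaviour can be sketched as a shared, append-only buffer that readers follow while it is still being written. This is an illustration only, not how any particular cache is implemented; `StreamingEntry` and its methods are hypothetical names. The origin side calls `append` for each chunk and `finish` at the end, and each client iterates `read()`, blocking only when it has caught up with the origin.

```python
import threading


class StreamingEntry:
    """Hypothetical in-progress cache object that readers can follow as it fills."""

    def __init__(self):
        self._chunks = []
        self._complete = False
        self._cond = threading.Condition()

    def append(self, chunk):
        # Called by the origin fetch for every chunk received.
        with self._cond:
            self._chunks.append(chunk)
            self._cond.notify_all()

    def finish(self):
        # Called once the origin has sent the whole body.
        with self._cond:
            self._complete = True
            self._cond.notify_all()

    def read(self):
        """Yield chunks from the start; block only when caught up with the origin."""
        i = 0
        while True:
            with self._cond:
                while i >= len(self._chunks) and not self._complete:
                    self._cond.wait()
                if i >= len(self._chunks):
                    return
                chunk = self._chunks[i]
            i += 1
            yield chunk
```

Under this model, in the 10-minute example from the question, every attached client would hold roughly half of the body after 5 minutes, and a client arriving late would first replay the buffered chunks and then follow along. The flip side is the one described above: once clients are attached to a slow stream, all of them are limited by the origin's throughput.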

Some implementations might avoid fanning in too much, maintaining a smaller pool of connections rather than trying to get to "1", but that’s ultimately a trade-off at each layer of the onion, as the connections can still add up across layers.
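One way to picture the bounded fan-in is to coalesce per shard instead of per object, so that each object has at most `pool_size` concurrent origin fetches rather than exactly one. The sketch below is an assumption about how that could be done, not a description of any specific product; `coalescing_key`, `client_id`, and `pool_size` are hypothetical names.

```python
import zlib


def coalescing_key(url, client_id, pool_size):
    """Map a request to one of pool_size coalescing groups for this URL."""
    # crc32 is used only as a cheap, stable hash of the client identifier.
    shard = zlib.crc32(client_id.encode("utf-8")) % pool_size
    return (url, shard)
```

Requests that share a key wait on the same origin fetch. Setting `pool_size` to 1 collapses everything onto a single fetch; a larger value limits how many clients one slow fetch can stall, at the cost of more origin connections per object.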

(I worked at both Cloudflare and Google, and it was a common topic: request coalescing is a big deal for large customers)

I think the nginx that members of the public can get from their package manager does not have this feature, and will force each client other than the first to either wait for the entire body to be downloaded or wait for a timeout and hit the origin in a non-cacheable request.
I don't know for certain, but my hunch is that it streams the output to multiple waiting clients as it receives it from the origin. Would have to do some testing to confirm that though.
Varnish has defaulted to streaming responses since Varnish 4. I think it gets used for a lot of video streaming use cases.