by anonymoushn
1778 days ago
Do you know if Varnish's request coalescing allows it to send partial responses to every client? For example, if an origin server sends headers immediately and then takes 10 minutes to send the response body at a constant rate, will every client have half of the response body after 5 minutes? Thanks!
But this is much, much harder to do once you are already streaming the response: if the time to first byte (TTFB) is quick but the connection is low-throughput, you can’t do much at that point. That said, nearly all modern implementations stream the bytes to all clients immediately; they don’t wait to fill the cache first, they fill it while streaming.
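To make the "stream and fill simultaneously" behaviour concrete, here is a minimal sketch in Python asyncio. It is not how Varnish or any particular CDN implements it; `CoalescedStream`, `origin_from_queue` and `demo` are made-up names, and the origin is simulated with a queue so the timing is controllable.

```python
import asyncio


class CoalescedStream:
    """One in-flight origin fetch shared by many clients (hypothetical sketch)."""

    def __init__(self):
        self.chunks = []   # body received so far; doubles as the cache fill
        self.done = False  # set once the origin has finished the body
        self.cond = asyncio.Condition()

    async def fill(self, origin_chunks):
        # Append each origin chunk and wake every waiting client right away.
        async for chunk in origin_chunks:
            async with self.cond:
                self.chunks.append(chunk)
                self.cond.notify_all()
        async with self.cond:
            self.done = True
            self.cond.notify_all()

    async def read(self):
        # Each client replays what is already buffered, then follows the live stream.
        i = 0
        while True:
            async with self.cond:
                await self.cond.wait_for(lambda: i < len(self.chunks) or self.done)
                if i >= len(self.chunks):
                    return
                chunk = self.chunks[i]
            i += 1
            yield chunk


async def origin_from_queue(q):
    # Stand-in for a slow origin: yields chunks as they arrive; None ends the body.
    while (chunk := await q.get()) is not None:
        yield chunk


async def demo():
    stream = CoalescedStream()
    q = asyncio.Queue()
    seen = {"a": [], "b": []}  # chunks received per client

    async def client(name):
        async for chunk in stream.read():
            seen[name].append(chunk)

    tasks = [asyncio.create_task(stream.fill(origin_from_queue(q)))]
    tasks += [asyncio.create_task(client(name)) for name in seen]

    q.put_nowait(b"first half ")
    await asyncio.sleep(0.05)  # origin is "halfway"; let the clients catch up
    halfway = {name: b"".join(parts) for name, parts in seen.items()}

    q.put_nowait(b"second half")
    q.put_nowait(None)
    await asyncio.gather(*tasks)
    final = {name: b"".join(parts) for name, parts in seen.items()}
    return halfway, final


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch both clients hold `b"first half "` at the halfway snapshot, which is the behaviour the question asks about: nobody waits for the full body, and the buffered chunks are what would end up in the cache.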
Some implementations might avoid fanning in too much, maintaining a smaller pool of origin connections rather than trying to get to “1”, but that’s ultimately a trade-off at each layer of the onion, as those connections can still add up across layers.
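A rough sketch of that "small pool instead of 1" idea, again in Python asyncio and again with invented names (`BoundedFanIn`, `max_origin_fetches`, `demo`): the first few requests for a key each open their own origin fetch, and later requests join one of the in-flight fetches round-robin. Real proxies differ in how they pick the limit and which fetch a waiter joins.

```python
import asyncio
from collections import defaultdict


class BoundedFanIn:
    """Allow up to N concurrent origin fetches per key (hypothetical sketch)."""

    def __init__(self, max_origin_fetches):
        self.max = max_origin_fetches
        self.inflight = defaultdict(list)  # key -> in-flight fetch tasks
        self.next = defaultdict(int)       # key -> round-robin counter

    async def get(self, key, fetch):
        tasks = self.inflight[key]
        if len(tasks) < self.max:
            # Still under the limit: open another origin fetch for this key.
            task = asyncio.ensure_future(fetch(key))
            tasks.append(task)
            task.add_done_callback(lambda t: tasks.remove(t))
        else:
            # At the limit: join an existing fetch instead of opening a new one.
            task = tasks[self.next[key] % len(tasks)]
            self.next[key] += 1
        # shield() keeps one cancelled waiter from cancelling the shared fetch.
        return await asyncio.shield(task)


async def demo():
    origin_calls = 0

    async def fetch(key):
        nonlocal origin_calls
        origin_calls += 1
        await asyncio.sleep(0.01)  # simulated origin latency
        return f"body of {key}"

    pool = BoundedFanIn(max_origin_fetches=3)
    bodies = await asyncio.gather(
        *(pool.get("/video.mp4", fetch) for _ in range(10))
    )
    return origin_calls, bodies


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With ten concurrent requests and a limit of 3, the simulated origin sees three fetches rather than one or ten. Stack a few such layers and the origin load is the product of the per-layer limits, which is the "add up" part.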
(I worked at both Cloudflare and Google, and it was a common topic: request coalescing is a big deal for large customers.)