by sciurus 1778 days ago
I'll echo what Simon said; we share some experiences here. There's a potential footgun, though, that anyone getting started with this should know about:

Request coalescing can be incredibly beneficial for cacheable content, but for uncacheable content you need to turn it off! Otherwise you'll cause your cache server to serialize requests to your backend for it. Let's imagine a piece of uncacheable content takes one second for your backend to generate. What happens if your users request it at a rate of twice a second? Those requests are going to start piling up, breaking page loads for your users while your backend servers sit idle.
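To put rough numbers on that pile-up, here is a minimal sketch in Python. It assumes the cache forwards the coalesced requests strictly one at a time; the function name and the parameters are made up for illustration and do not model any particular cache server.

```python
def coalesced_wait_times(n_requests, arrival_interval=0.5, service_time=1.0):
    """Seconds each request waits when the cache sends them to the backend one by one.

    arrival_interval=0.5 models two requests per second; service_time=1.0 models
    the one-second uncacheable response from the example above.
    """
    waits = []
    backend_free_at = 0.0
    for i in range(n_requests):
        arrived = i * arrival_interval
        # A request cannot start until the previous one has finished.
        started = max(arrived, backend_free_at)
        backend_free_at = started + service_time
        waits.append(backend_free_at - arrived)
    return waits


print(coalesced_wait_times(5))  # [1.0, 1.5, 2.0, 2.5, 3.0]
```

Under these assumptions every new request waits half a second longer than the one before it, and the wait never stops growing for as long as the traffic continues.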

If you are using Varnish, the hit-for-miss concept addresses this. However, it's easy to get wrong when you start writing your own VCL. Be sure to read https://info.varnish-software.com/blog/hit-for-miss-and-why-... and related posts. My general answer to getting your VCL correct is writing tests, but this is a tricky behavior to validate.

I'm unsure how nginx's caching handles this, which would make me nervous using the proxy_cache_lock directive for locations with a mix of cacheable and uncacheable content.


And to add the last big one from the trifecta:

Know how to deal with cacheable data. Know how to deal with uncacheable data. But by all means, know how to keep them apart.

Accidentally caching uncacheable data has led to some of the ugliest and most avoidable data leaks and compromises in recent times.

If you go down the "route everything through a CDN" route (that can be as easy as ticking a box in the Google Cloud Platform backend), make extra sure to flag authenticated data as cache-control: private / no-cache.

no-cache does not mean content must not be cached - in fact, it specifies the opposite!

no-cache means that the response may be stored in any cache, but cached content MUST be revalidated before use.

public means that the response may be cached in any cache even if the response was not normally cacheable, while private restricts this to only the user agent's cache.

no-store specifies that this response must not be stored in any cache. Note that this does not prevent previously cached responses from being used.

max-age=0 can be added to no-store to also invalidate old cached responses should one have accidentally sent a cacheable response for this resource. No other directives have any effect when using no-store.
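As a rough illustration of the rules above, here is a small Python sketch of how a shared cache (a CDN or reverse proxy) might read a Cache-Control response header. The function names and the returned strings are invented for this example. It is deliberately simplified: it ignores quoted arguments that contain commas, max-age, s-maxage, must-revalidate and heuristic freshness, so treat it as a memory aid rather than a spec-compliant parser.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: argument} dict (simplified)."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"')
    return directives


def shared_cache_policy(header):
    """Describe what a shared cache may do with a response (hypothetical helper)."""
    d = parse_cache_control(header)
    if "no-store" in d:
        # no-store wins over everything else in the header.
        return "do not store"
    if "private" in d:
        # Only the user agent's own cache may keep it.
        return "browser cache only"
    if "no-cache" in d:
        # Storing is allowed, but every reuse needs revalidation first.
        return "store, revalidate before every use"
    return "store"


print(shared_cache_policy("no-cache"))
print(shared_cache_policy("private, max-age=60"))
print(shared_cache_policy("no-store, max-age=0"))
```

The order of the checks mirrors the descriptions above: no-store is checked first, then private, and no-cache only adds a revalidation requirement to a response that may otherwise be stored.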

That’s the best synopsis of the cache options I’ve ever read. It’s one of those things I have to pull documentation on every time I use it, but the way you just explained it makes so much sense that I might just memorize it now.

Edit: And now I see that you just copied bits from the Moz Dev page. I'll have to start using those more. I think the MS docs always come up first in Google.

MDN docs are quite good at times. And yes, certain parts were copy pasted in, as I didn't want to accidentally end up spreading misinformation.

Also note that I only mentioned the usual suspects - there are many more options, like must-revalidate.

Speaking of non-cacheable data:

https://arstechnica.com/gaming/2015/12/valve-explains-ddos-i...

Caching is HARD.