Hacker News
by imaginenore 4276 days ago
Varnish left a sour taste in my mouth. We used it on one of our high-traffic sites and had some truly bizarre, hard-to-reproduce problems with it.

The major bug was that it would work normally for hours, then randomly let a flood of requests through, basically killing our servers. It happened over and over again. We had multiple talented sysadmins look at it, and none of them could give us any explanation. The only solution was to restart it and warm the cache all over again. We couldn't figure out what set it off; it looked completely random.

2 comments

That sounds like a "hit-for-pass" scenario: something from your backend told Varnish "Don't cache this", and Varnish stopped caching it.
This. The thing to remember is that you always have to set beresp.ttl in vcl_fetch in a hit-for-pass situation. Varnish caches the decision to hit-for-pass (or lookup or whatever), so if you do a hit-for-pass and your TTL is 1 hour, Varnish will hit-for-pass that cache key for the next hour without running your VCL logic again.
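For illustration, a minimal sketch of what that might look like in Varnish 3.x VCL. The conditions that trigger the pass and the 120-second TTL are assumptions chosen for the example, not anything taken from the setup described above:

```vcl
# Varnish 3.x syntax (vcl_fetch). A sketch, not a drop-in config.
sub vcl_fetch {
    # Assumed trigger: the backend response looks uncacheable.
    if (beresp.http.Set-Cookie ||
        beresp.http.Cache-Control ~ "(private|no-cache|no-store)") {
        # Bound how long the hit-for-pass decision is remembered.
        # 120s is an illustrative value, not a recommendation.
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);
}
```

The point of setting the TTL explicitly is that a short value limits how long one bad backend response can keep a cache key in pass mode.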
Except we didn't have that. This was 99.9% content pages, dynamically generated but nearly static. They all had roughly the same headers, which hadn't really changed, and they were not even aware of Varnish.

Response headers were the first thing we checked; only a few of them affect Varnish:

https://www.varnish-software.com/static/book/HTTP.html

Are you sure it wasn't an issue with cache evictions?
All pages at the same exact time?
Next time call me and I will make sure that the best minds doing cache invalidation in the industry have a look and fix your issue.

[Varnish Software sales hat on]