Hacker News
by BlackAura 4112 days ago
Unwarranted hostility aside...

The fact that Varnish changed over the years neither invalidates this article, nor vindicates Squid's design.

On any remotely modern system (say, 2006 or later), Squid's design is absurd. The critique in this article is spot on. Squid basically pretends that the operating system's virtual memory system and disk cache simply don't exist, and spends its time working against them. This does cause exactly the kind of problems detailed in the article.

Of course, that's because Squid is not Varnish. Squid was designed a long time ago, with maximum portability in mind, and intended to run on operating systems with very poor VM and disk cache systems. With that in mind, Squid's design makes sense. It just doesn't make sense on newer systems.

In Varnish, all of this work was delegated to the operating system. This works very well. It's certainly a lot simpler than Squid, in addition to being a lot faster.

As long as most of your hot data can fit in the disk cache, at least. The infrequently used parts, which could well be a lot larger than the frequently used parts, can be kicked out to disk by the OS, and although reading them back in incurs a performance penalty, it's not that bad. It only really affects less commonly accessed data, and doesn't interfere with everything else.
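
To make the "delegate it to the OS" idea concrete, here's a minimal sketch in Python. It is not Varnish's code: the `MmapCache` class, its bump allocator and the file name are made up for illustration, and there's no eviction at all. The point is only that the cache does plain memory reads and writes on a mapped file, and the kernel decides which pages stay in RAM and which get written out to disk.

```python
import mmap
import os
import tempfile


class MmapCache:
    """Toy object cache backed by one memory-mapped file (hypothetical)."""

    def __init__(self, path, size):
        fd = os.open(path, os.O_RDWR | os.O_CREAT)
        try:
            os.ftruncate(fd, size)            # reserve the backing file
            self.store = mmap.mmap(fd, size)  # shared read/write mapping
        finally:
            os.close(fd)                      # the mapping keeps its own handle
        self.index = {}      # key -> (offset, length)
        self.next_free = 0   # naive bump allocator, no eviction

    def put(self, key, body):
        end = self.next_free + len(body)
        if end > len(self.store):
            raise MemoryError("cache file is full")
        self.store[self.next_free:end] = body  # just a memory write
        self.index[key] = (self.next_free, len(body))
        self.next_free = end

    def get(self, key):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        # Just a memory read; if the page is not resident, the OS faults it in.
        return self.store[offset:offset + length]

    def close(self):
        self.store.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = MmapCache(os.path.join(tmp, "cache.bin"), 1 << 20)
        cache.put("/index.html", b"<html>hello</html>")
        print(cache.get("/index.html"))
        cache.close()
```

Notice there's no buffer management or explicit disk I/O anywhere in the class. That's the simplicity argument: hot pages stay in the page cache, cold ones get paged out, and the program never has to know which is which.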

The original Varnish design works great for that. It's less good if your entire working set fits in RAM (in which case, the slightly newer malloc-based system is faster because it has lower overhead, but becomes much slower if you really need to swap).

Varnish starts to fall down if your working set doesn't fit in RAM (in which case, you're doomed regardless), or if the total cache is really huge (think somewhere in the terabyte range).

The new storage engine mostly just re-organizes the existing mmap-based caches. It has better cache eviction algorithms, which give a much higher cache hit rate, and much lower internal fragmentation. That alone accounts for nearly all of the performance benefit.

The only I/O change I can find is that it uses the write syscall to write newly cached objects to the file directly, rather than writing to the mmap'd file. That allows them to replace the contents of those pages atomically - the OS will just drop them into the disk cache, rather than potentially having to re-read them from disk if they happen to not be in the cache.
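
A rough sketch of that split, again in Python and again not the actual storage engine: new objects go in through `pwrite`, reads keep going through the mapping. The helper names (`store_object`, `demo`) and the one-page object size are my own choices for illustration. It also assumes a POSIX system with a unified page cache (Linux, for example), where data written with `write`/`pwrite` shows up in a shared mapping of the same file.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def store_object(fd, offset, body):
    """Write `body` at `offset` with pwrite, looping over short writes."""
    view = memoryview(body)
    done = 0
    while done < len(view):
        done += os.pwrite(fd, view[done:], offset + done)
    return done


def demo():
    """Write one full page via the syscall, then read it via the mapping."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(os.path.join(tmp, "cache.bin"), os.O_RDWR | os.O_CREAT)
        try:
            os.ftruncate(fd, 4 * PAGE)
            mapping = mmap.mmap(fd, 4 * PAGE)  # read path: shared mapping
            body = b"x" * PAGE                 # exactly one full page
            store_object(fd, 2 * PAGE, body)   # write path: syscall
            same = mapping[2 * PAGE:3 * PAGE] == body
            mapping.close()
            return same
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo())
```

The difference from storing through the mapping is what happens when the target page isn't resident. A store into the mapping faults the old page in from disk first, only to overwrite it. A page-aligned, full-page `pwrite` gives the kernel the complete new contents, so it has no reason to read the old ones.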

All of the reading, memory management and I/O is still done by the VM and disk cache systems of the OS. That hasn't changed.

1 comment

Thanks for the details. The point still stands that the original Varnish design was also less than optimal, even for the computers of 2006, and that the more recent changes make it use the hardware better.