Hacker News
by thrownaway2424 3933 days ago
I don't think that's very good advice in a heavily-loaded shared hosting environment. A disk read could easily stall for tens of seconds, just because the kernel whimsically decided to throw out the cache (or because your server crowded its memory container). I actually don't want any server touching a disk while it's serving. Everything should be read before service begins and never again.
4 comments

Your proposed solution (read from disk on startup and never again) is really a memory-backed data store, not a cache. Caches can miss.

But let's analyze your example. If disk reads take tens of seconds and memory usage is high enough to purge the kernel's disk cache, nothing can save you. Had your process read in everything at the start, it would be using even more memory. Given the same load, one of two things will happen:

1. If you have swap enabled, parts of your process's memory will be swapped out. Accessing "memory" in this case would cause a page fault and tens of seconds of delay.

2. If you have swap disabled, the OOM-killer will reap your process. When it respawns, it's going to read lots of stuff from disk... and disk reads take tens of seconds. Oops.

Even if an application-level data cache improved performance on heavily-loaded shared hosts, the added costs of software development and maintenance far exceed the cost of better hardware. Hardware is cheap. Developers are expensive.

Here's an example. You have a 100MB C++ executable that needs 4GB for its own various purposes and 20GB of data that it's serving. The machine has 64GB of memory. If you allocate 24.1GB of memory to the container for this service, disable swap, and mlock the binary and the data files, nothing will go wrong.
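For anyone who hasn't used mlock, here is a minimal sketch of pinning a mapped region in RAM, assuming Linux and CPython. The helper name `lock_buffer` is made up for illustration, and an anonymous mapping stands in for the data files. Python's `mmap` module has no mlock wrapper, so the call goes through `ctypes`. It can fail if RLIMIT_MEMLOCK is too low, which is why the helper reports failure instead of assuming success.

```python
import ctypes
import mmap
import os


def lock_buffer(buf: mmap.mmap) -> bool:
    """Try to pin every page of `buf` in RAM with mlock(2).

    Hypothetical helper for illustration. Returns True on success and
    False if the kernel refuses (for example, RLIMIT_MEMLOCK is too low).
    """
    # CDLL(None) exposes the symbols already loaded into the process,
    # which includes libc on Linux.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int

    # Borrow the mapping's address; the view must be released before
    # the mmap object can be closed.
    view = ctypes.c_char.from_buffer(buf)
    try:
        rc = libc.mlock(ctypes.addressof(view), len(buf))
    finally:
        del view

    if rc != 0:
        print("mlock failed:", os.strerror(ctypes.get_errno()))
        return False
    return True
```

A real server would map its data files and lock those mappings instead; unmapping the region also drops the lock.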

On the same machine is a batch process which is reading a 1TB file and writing another 1TB file. If your serving process were reliant on the OS page cache, it would find that its pages were routinely evicted in favor of the batch process.

You're right about swap; that's why only a crank would enable swap. The moment at which swap was a reasonable solution was already behind us 20 years ago.

In that example, I'm pretty sure forgoing containers and mlock would result in similar performance while using less memory. Process startup time would also be significantly improved. (If there's such high contention for disk I/O, reading 20GB on startup is going to take a very long time.)

The kernel's page cache eviction strategy is smarter than naïve LRU. On the first read, a page is placed in the inactive file list. If it's read again, it's moved to the active file list. Pages in the inactive file list are purged before the active file list.[1] So large sequential reads may cause disk contention, but they won't massacre the file cache.
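A toy simulation of the inactive/active idea described above, compared against plain LRU. This is only a sketch: the class and function names are made up, and the real kernel also rebalances the two lists, which is not modelled here.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy inactive/active page cache; not the kernel's actual algorithm."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # pages read once
        self.active = OrderedDict()    # pages read at least twice

    def read(self, page):
        """Touch a page and return True on a cache hit."""
        if page in self.active:
            self.active.move_to_end(page)
            return True
        if page in self.inactive:
            del self.inactive[page]
            self.active[page] = None  # promote on the second read
            return True
        self.inactive[page] = None
        self._evict()
        return False

    def _evict(self):
        # Purge the inactive list first; touch the active list only if
        # the inactive list is empty.
        while len(self.inactive) + len(self.active) > self.capacity:
            victims = self.inactive if self.inactive else self.active
            victims.popitem(last=False)


def hot_pages_surviving_two_list(capacity, hot, scan):
    """Read the hot pages twice, then scan once; list the hot pages still cached."""
    cache = TwoListCache(capacity)
    for page in list(hot) * 2 + list(scan):
        cache.read(page)
    return [p for p in hot if p in cache.active or p in cache.inactive]


def hot_pages_surviving_lru(capacity, hot, scan):
    """Same access pattern against a single naive LRU list."""
    lru = OrderedDict()
    for page in list(hot) * 2 + list(scan):
        lru[page] = None
        lru.move_to_end(page)
        if len(lru) > capacity:
            lru.popitem(last=False)
    return [p for p in hot if p in lru]
```

With a capacity of 8 pages, 4 hot pages, and a 1000-page sequential scan, the two-list model keeps all four hot pages while naive LRU loses every one of them.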

This I/O situation isn't uncommon. Consumer systems also have big batch jobs that can pollute file caches: large copies, rsyncs, backup software (Déjà Dup, Time Machine, etc.). They don't solve this with containers, limits, and mlock()ing. Some programs add a couple of calls to posix_fadvise(), using the POSIX_FADV_NOREUSE or POSIX_FADV_DONTNEED flags.[2] But for the most part, doing nothing yields excellent performance. Operating systems are pretty good at their job.
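As a rough illustration of the fadvise approach, here is a sketch of a sequential reader that tells the kernel it won't revisit what it has just read. It assumes Linux (`os.posix_fadvise` is not available on every platform), and the function name `read_without_caching` is made up. The call is only a hint, so the kernel is free to ignore it.

```python
import os


def read_without_caching(path, chunk_size=1 << 20):
    """Read a file sequentially, hinting that its cached pages can be dropped.

    Hypothetical helper for illustration. Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # Advise the kernel that the range just read is not needed again.
            os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

A backup or copy tool would do its real work on each chunk before dropping it; the point is only that a batch job can opt out of polluting the cache with a single extra call per chunk.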

1. https://www.kernel.org/doc/gorman/html/understand/understand...

2. This is handy for applications like bittorrent, where multiple reads of the same page are possible, but caching isn't desired.

If only O_STREAMING had made it to the kernel... https://lwn.net/Articles/12100/
In other words, you're advocating using more memory in a shared environment just so your server can (try to) be faster? What happens if everyone else does the same? Everyone ends up competing for the limited amount of memory, and no one wins.

"I'll grab all the memory I can so others can't use it" is a horrible way to think, as anyone who has attempted to simultaneously use multiple applications written with this mindset will know. One takes most of the memory, forcing other apps into swap, and then the opposite happens when you start working with one of the others, accompanied by massive swapping slowdowns.

Can I suggest that if you've got problems including "stalls for tens of seconds on disk reads" you are almost certainly better off directing available resources towards fixing your hosting problems, rather than going down the cache rabbit hole on a hosting platform that's not really suitable for production use?

(With caveats for zero-resource projects of course, but even for those I strongly suspect that for many people paying $5 or $10 per month for "less crap hosting" is probably a better solution than prematurely optimising by adding caching and all its inherent complexity to a fundamentally broken platform.)

There's nothing you or I can do about the trend. They put more and more cores into a machine and the same number of disks (current-model Xeon servers have 72 threads and 1 or 2 disks), which guarantees that, at some point, the disk is highly oversubscribed.
Sure - the "race to the bottom" for hosting prices inevitably means there are going to be options like GoDaddy offering "a year's worth of webhosting for $5" which clusters 400 WordPress and Drupal sites onto a single Raspberry Pi or similar, but you don't _have_ to go there.

I can understand if you're an open source developer who gets paid in Uzbeki Som or Nigerian Naira, the calculation of "do I spend a day or two putting caching in place" versus "do I spend an extra $50 or $100 per year on hosting" might lean very much the other way, but I suspect for the vast majority of HN readers, the prudent approach is "pay a hundred or two dollars a year for hosting before bothering to implement complex caching strategies".

In that sort of environment, I wouldn't be surprised if your app's internal cache ended up paged out anyway...
I don't allow swap on my machines, and I mlock executable pages, so I'd personally be surprised if anything was paged out.
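If you want to confirm that a Linux box really has no swap configured, a small sketch like this checks the SwapTotal line in /proc/meminfo. The function names are made up for illustration, and it assumes the usual "SwapTotal: N kB" layout.

```python
def swap_total_kib(meminfo_text):
    """Return the SwapTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    raise ValueError("no SwapTotal line found")


def swap_enabled(path="/proc/meminfo"):
    """Return True if the machine has any swap configured (Linux only)."""
    with open(path) as f:
        return swap_total_kib(f.read()) > 0
```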