by thrownaway2424 3933 days ago
Here's an example. You have a 100MB C++ executable that needs 4GB for its own various purposes and 20GB of data that it's serving. The machine has 64GB of memory. If you allocate 24.1GB of memory to the container for this service, disable swap, and mlock the binary and the data files, nothing will go wrong.
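To make the "mlock the binary and the data files" step concrete, here is a rough sketch of pinning a process's pages from Python by calling mlockall(2) through ctypes. The helper names are made up for illustration, and the flag values are an assumption (they are the Linux values on x86 and ARM). The call only succeeds with CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK, and data files are only covered if the process has mmap()ed them.

```python
import ctypes
import os

# Assumption: Linux flag values on x86 and ARM; some architectures differ.
MCL_CURRENT = 1  # lock pages that are mapped right now
MCL_FUTURE = 2   # also lock pages that get mapped later

# CDLL(None) exposes the symbols already loaded into this process, libc included.
_libc = ctypes.CDLL(None, use_errno=True)


def lock_all_memory(include_future=True):
    """Hypothetical helper: pin this process's pages in RAM.

    Returns None on success, or the OS error message on failure.
    """
    flags = MCL_CURRENT | (MCL_FUTURE if include_future else 0)
    if _libc.mlockall(flags) != 0:
        return os.strerror(ctypes.get_errno())
    return None


def unlock_all_memory():
    """Hypothetical helper: undo lock_all_memory(). Returns True on success."""
    return _libc.munlockall() == 0
```

A service would call lock_all_memory() once at startup, after mapping its data files, and treat a non-None result as a configuration error.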

On the same machine is a batch process which is reading a 1TB file and writing another 1TB file. If your serving process were reliant on the OS page cache, it would find its pages routinely evicted in favor of the batch process.

You're right about swap; that's why only a crank would enable it. The moment at which swap was a reasonable solution was already behind us 20 years ago.

2 comments

In that example, I'm pretty sure forgoing containers and mlock would result in similar performance while using less memory. Process startup time would also be significantly improved. (If there's such high contention for disk I/O, reading 20GB on startup is going to take a very long time.)

The kernel's page cache eviction strategy is smarter than naïve LRU. On the first read, a page is placed in the inactive file list. If it's read again, it's moved to the active file list. Pages in the inactive file list are purged before the active file list.[1] So large sequential reads may cause disk contention, but they won't massacre the file cache.
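A toy model of the two-list behavior described above, to show why one big sequential read doesn't flush a hot working set. This is only a sketch of the idea, not the kernel's algorithm: the class and function names are invented, and real kernels also rebalance and demote pages between the lists, which is left out here.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy model: pages read once sit in 'inactive', re-read pages in 'active'."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.inactive = OrderedDict()  # read once, oldest first
        self.active = OrderedDict()    # read at least twice, oldest first

    def read(self, page):
        """Return True on a cache hit, False on a miss."""
        if page in self.active:
            self.active.move_to_end(page)
            return True
        if page in self.inactive:
            # Second read: promote to the active list.
            del self.inactive[page]
            self.active[page] = True
            return True
        # First read: make room if needed, then insert into the inactive list.
        self._make_room()
        self.inactive[page] = True
        return False

    def _make_room(self):
        while len(self.inactive) + len(self.active) >= self.capacity:
            # Inactive pages are purged before active ones.
            victims = self.inactive if self.inactive else self.active
            victims.popitem(last=False)


def hot_hits_after_scan(capacity=100, hot=50, scan=10_000):
    """Read a hot set twice, stream a large scan past it, count surviving hot pages."""
    cache = TwoListCache(capacity)
    for _ in range(2):
        for page in range(hot):
            cache.read(page)
    for page in range(hot, hot + scan):  # each scanned page is read exactly once
        cache.read(page)
    return sum(cache.read(page) for page in range(hot))
```

With the defaults, all 50 hot pages are still cached after the 10,000-page scan, because the scan only ever churns the inactive list. A single plain LRU list of the same size would have evicted every one of them.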

This I/O situation isn't uncommon. Consumer systems also have big batch jobs that can pollute file caches: large copies, rsyncs, backup software (Déjà Dup, Time Machine, etc.). They don't solve this with containers, limits, and mlock()ing. Some programs add a couple of calls to posix_fadvise(), using the POSIX_FADV_NOREUSE or POSIX_FADV_DONTNEED flags.[2] But for the most part, doing nothing yields excellent performance. Operating systems are pretty good at their job.
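For the curious, this is roughly what "a couple of calls to fadvise" looks like, using Python's os.posix_fadvise wrapper (Unix only). It is a sketch, not how any particular backup tool does it: the function name and chunk size are made up, and the advice is only a hint that the kernel is free to ignore.

```python
import os


def read_without_caching(path, chunk_size=1 << 20):
    """Hypothetical helper: stream a file once, hinting that its pages won't be reused.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Offset 0 with length 0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            # ... process the chunk here ...
            # Tell the kernel the range just read can be dropped from the page cache.
            os.posix_fadvise(fd, total, len(chunk), os.POSIX_FADV_DONTNEED)
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

A batch job would use something like this for its 1TB input, so that its one-pass reads are less likely to compete with other processes' cached pages.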

1. https://www.kernel.org/doc/gorman/html/understand/understand...

2. This is handy for applications like BitTorrent, where multiple reads of the same page are possible, but caching isn't desired.

If only O_STREAMING had made it to the kernel... https://lwn.net/Articles/12100/