by ggreer
3939 days ago
Your proposed solution (read from disk on startup and never again) is really a memory-backed data store, not a cache. Caches can miss.

But let's analyze your example. If disk reads take tens of seconds and memory usage is high enough to purge the kernel's disk cache, nothing can save you. Had your process read in everything at the start, it would be using even more memory. Given the same load, one of two things will happen:

1. If you have swap enabled, parts of your process's memory will be swapped-out. Accessing "memory" in this case would cause a page fault and tens of seconds of delay.

2. If you have swap disabled, the OOM-killer will reap your process. When it respawns, it's going to read lots of stuff from disk... and disk reads take tens of seconds. Oops.

Even if an application-level data cache improved performance on heavily-loaded shared hosts, the added costs of software development and maintenance far exceed the cost of better hardware. Hardware is cheap. Developers are expensive.
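To make the store-versus-cache distinction concrete, here is a rough Python sketch. Everything in it is hypothetical and only for illustration: `slow_disk_read`, `PreloadedStore`, and `LRUCache` are made-up names, and the "disk read" is a stand-in function rather than real I/O. The store loads everything once and never goes back to disk. The cache holds a bounded subset and falls back to disk on a miss.

```python
from collections import OrderedDict


def slow_disk_read(key):
    # Hypothetical stand-in for an expensive disk read.
    return f"value-for-{key}"


class PreloadedStore:
    """Memory-backed store: reads everything at startup, never again."""

    def __init__(self, keys):
        self._data = {k: slow_disk_read(k) for k in keys}

    def get(self, key):
        # No fallback path: a key that was not preloaded raises KeyError.
        return self._data[key]


class LRUCache:
    """Bounded cache: can miss, and a miss costs a disk read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        self.misses += 1
        value = slow_disk_read(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
        return value
```

Note that the store's memory footprint grows with the full data set, while the cache's is capped at `capacity` entries. That is the trade-off behind the "it would be using even more memory" point above.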
On the same machine is a batch process that is reading a 1TB file and writing another 1TB file. If your serving process were reliant on the OS page cache, it would find that its pages were routinely evicted in favor of the batch process.
You're right about swap; that's why only a crank would enable it. The moment at which swap was a reasonable solution was already behind us 20 years ago.