> Very few people on this thread read and understood the article.
Hmm. I read the article and I think I understood it. However, in my experience, you run out of RAM if and only if your working set is too big. In my experience, all involved find it desirable to reduce the size of the working set as quickly as possible. Your experience seems to differ.
> The point isn't working with data sets larger than RAM. The point is making better use of the RAM you do have by taking pages you'll almost never touch and spilling them to disk so that there's more room in RAM for pages you will touch.
Your reasoning is too sloppy. It supports neither your blanket statements nor your pained analogy. You appear to presuppose that:
(1) The kernel can predict which pages the user will "almost never touch."
(2) Mispredicting which pages will be "almost never touched" is of relatively low cost.
(3) Swapping pages that the user will "almost never touch" to disk frees up an appreciable amount of RAM.
(4) When pulling those pages back from disk, the work held up is, on average, less important than whatever we got to do with the RAM in the meantime.
I disagree with (1). Like I said elsewhere in the comments on this article, the kernel cannot reliably predict whether a process will "almost never touch" a given page. The kernel does not have sufficiently detailed knowledge of the process's purpose or access patterns.
I also disagree with (2). The consequences of getting these predictions wrong seem to be very bad. When lots of mispredictions happen in a tight cluster, the kernel and all running processes will be stopped when the user forcibly bounces the machine. If you let the OOM killer run instead of swapping, the kernel stays up and only a few running processes die. Having a working set whose size is larger than RAM but smaller than RAM + swap seems to be a recipe for a very long cluster of such mispredictions and a human intervention.
I am curious to hear about workloads where (3) occurs. (Non-latency-sensitive Java code that doesn't churn objects too fast? You've allocated a heap of a certain size, and the half or so that's free doesn't get disturbed too much.)
Regarding (4), even if the kernel could reliably predict cold pages, "page will almost never be touched" isn't necessarily the right criterion for swapping a page to disk. What if reading from the page will be on the critical path for something users do care about, such as logging in and killing a misbehaving process?
> I disagree with (1). [...] The kernel does not have sufficiently detailed knowledge of the process's purpose or access patterns.
You're in for quite a surprise, particularly on the desktop. I have a number of processes with some pages swapped out, and I see no impact when interacting with those processes. Firefox, gDesklets, a volume changer, and several instances of rxvt are among them.
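If you want to check this on your own Linux box, here is a rough sketch that lists processes with pages currently in swap. It assumes a kernel that exposes the `VmSwap` field in `/proc/<pid>/status`. The helper names `parse_vmswap_kib` and `swapped_processes` are mine, made up for illustration, not part of any tool.

```python
import os
import re


def parse_vmswap_kib(status_text):
    """Return the VmSwap value in KiB from /proc/<pid>/status text, or None.

    Kernel threads and older kernels have no VmSwap line, hence None.
    """
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def swapped_processes(proc_root="/proc"):
    """List (pid, name, swapped_kib) for processes with pages in swap.

    Sorted with the largest swap user first. Returns an empty list when
    proc_root does not exist (for example, on a non-Linux system).
    """
    if not os.path.isdir(proc_root):
        return []
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                text = f.read()
        except OSError:
            continue  # process exited or is not readable by this user
        swapped = parse_vmswap_kib(text)
        if swapped:
            name = re.search(r"^Name:\s+(.*)$", text, re.MULTILINE)
            results.append((int(entry), name.group(1) if name else "?", swapped))
    return sorted(results, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for pid, name, kib in swapped_processes():
        print(f"{pid:>7}  {kib:>10} KiB  {name}")
```

This only tells you how much of each process sits in swap right now. It says nothing about how often those pages get faulted back in, which is the part we are actually arguing about.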
> (2) Mispredicting which pages will be "almost never touched" is of relatively low cost.
> I also disagree with (2). The consequences of getting these predictions wrong seem to be very bad.
Only in the case of repeated mispredictions, which happen only if you are genuinely low on RAM and well on your way to invoking the OOM killer anyway. With (1) being quite accurate (mainly because swapping out unused pages is not that aggressive), (2) magically becomes true as well.