by lobster_johnson 5227 days ago
> This sounds like a disaster waiting to happen

Well, define "disaster". A memory-mapped dataset will cause pathological performance only when there is thrashing. If you are accessing a small percentage of the entire dataset, then unused data will be paged out and remain paged out. If the bulk of data being accessed exceeds the amount of available physical RAM, then you will get I/O thrashing.

Memory-mapping is a good alternative to static allocation or a home-grown paging system because it lets the kernel handle the dynamics of allocation, so your application can transparently and gracefully handle memory pressure by relinquishing memory to other apps. Kernels (including Linux and Windows) and CPUs are extremely efficient at paging I/O, much more efficient than a hand-written paging system, because there's no need for the application's code to check whether a page is in physical memory; that's handled by the CPU itself.
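
To make that concrete, here is a minimal sketch in Python using the standard `mmap` module. The file name, the page count and the helper names (`make_dataset`, `read_page_ids`) are made up for illustration. It maps a whole file but touches only three pages of it, and nothing in the application code asks whether a page is resident.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE   # page size reported by the OS
N_PAGES = 256          # hypothetical dataset size, in pages


def make_dataset(path):
    """Write N_PAGES pages; each page starts with its own index (4 bytes)."""
    with open(path, "wb") as f:
        for i in range(N_PAGES):
            f.write(i.to_bytes(4, "little") + b"\0" * (PAGE - 4))


def read_page_ids(path, pages):
    """Map the whole file, but touch only the requested pages."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Plain slicing: no residency check in our code. If a page is
            # not in physical memory, the access faults and the kernel
            # loads it before the read completes.
            return [
                int.from_bytes(m[p * PAGE:p * PAGE + 4], "little")
                for p in pages
            ]


path = os.path.join(tempfile.mkdtemp(), "dataset.bin")
make_dataset(path)
ids = read_page_ids(path, [0, 7, 255])
print(ids)
```

Pages that are never touched cost nothing beyond address space, which is the "small percentage of the dataset" case described above.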

Of course, any I/O incurred by a too-large dataset will drastically reduce performance compared to in-memory speed. But paging in itself does not necessarily lead to "disaster".

1 comments

Did you read my entire comment? I specifically said, "when the working set is larger than RAM."

Your observations, while true in the abstract, don't reflect real-world behavior under these conditions.

A working set larger than RAM doesn't imply thrashing. It all depends on the page replacement algorithm employed by the OS and on the applications that are running.
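
A toy simulation shows the point. This is only a sketch: it assumes pure LRU replacement (real kernels use approximations and other heuristics), and the frame count and access patterns are invented for illustration. Both patterns touch 5 distinct pages with only 4 frames of "RAM", so in both cases the set of pages is larger than memory.

```python
from collections import OrderedDict


def lru_faults(accesses, frames):
    """Count page faults for an access sequence under pure LRU."""
    resident = OrderedDict()   # keys ordered from least to most recently used
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)        # hit: mark as most recent
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = True
    return faults


FRAMES = 4  # hypothetical "RAM": 4 page frames

# Cyclic scan over 5 pages: LRU always evicts the page needed soonest.
cyclic = [0, 1, 2, 3, 4] * 20

# Same 5 pages, but 0-2 are hot and 3-4 are touched only occasionally.
skewed = ([0, 1, 2] * 10 + [3, 4]) * 3

cyclic_faults = lru_faults(cyclic, FRAMES)
skewed_faults = lru_faults(skewed, FRAMES)
print(cyclic_faults, "faults in", len(cyclic), "accesses (cyclic)")
print(skewed_faults, "faults in", len(skewed), "accesses (skewed)")
```

Under these assumptions the cyclic scan faults on every access, while the skewed pattern faults only occasionally, even though neither fits in the 4 frames.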