It's not mmap per se that leads to double caching; it's combining the page cache with application-level caching. Say you're reading hugecactus.png into your image processing program. Whether you use mmap(2) or ordinary read(2), the first step in reading hugecactus.png is the kernel DMAing the file's bytes into the page cache. In the mmap case, the kernel maps those page cache pages into your application's address space. In the read case, the kernel copies from the page cache into the application's read buffer. Now suppose your application PNG-decodes hugecactus.png into RGB raster data. Whichever call you used, the machine now holds both the decoded RGB blob (in your process's memory) and the original PNG bytes (in the page cache) in RAM. That's usually wasteful.
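To make the two load paths concrete, here's a rough C sketch. The helper names load_with_read and load_with_mmap are made up for illustration, and error handling is minimal. Either way the compressed bytes end up in the page cache first; the only difference is whether your process gets a private copy or a mapping of the cached pages.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read(2) path. The kernel copies the file's bytes
 * from the page cache into a private heap buffer. Caller must free(). */
static unsigned char *load_with_read(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t size = (size_t)st.st_size;
    unsigned char *buf = malloc(size);
    size_t off = 0;
    while (buf != NULL && off < size) {
        ssize_t n = read(fd, buf + off, size - off);
        if (n <= 0) {            /* error or unexpected EOF */
            free(buf);
            buf = NULL;
            break;
        }
        off += (size_t)n;
    }
    close(fd);
    if (buf != NULL)
        *len = size;
    return buf;
}

/* Hypothetical helper: mmap(2) path. The returned pointer refers to the
 * page cache pages themselves, with no extra copy. Caller must munmap(). */
static const unsigned char *load_with_mmap(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}
```

Once you decode either result into a separate RGB buffer, the PNG bytes are still sitting in the page cache, which is the duplication in question.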

(Yes, you can reduce the severity of this problem with MADV_DONTNEED and friends.)
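For what "and friends" might look like in practice, here's a minimal sketch, assuming Linux and a read-only mapping of the file. drop_compressed_copy is a hypothetical helper you'd call once the decoded raster exists. Both calls are only hints: madvise(MADV_DONTNEED) tells the kernel this process no longer needs the mapped pages, and posix_fadvise(POSIX_FADV_DONTNEED) asks it to drop the file's cached pages, but the kernel is free to keep them around.

```c
#define _DEFAULT_SOURCE          /* for madvise() under strict -std modes */
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: call after decoding, when the compressed bytes
 * are no longer needed. map may be NULL if the file was loaded with
 * read(2) instead of mmap(2). Returns 0 if every hint was accepted,
 * -1 otherwise. */
static int drop_compressed_copy(int fd, void *map, size_t len)
{
    int rc = 0;

    /* Release this process's hold on the mapped pages. The mapping
     * stays valid; touching it again faults the data back in from
     * the file. */
    if (map != NULL && madvise(map, len, MADV_DONTNEED) != 0)
        rc = -1;

    /* Ask the kernel to evict the file's pages from the page cache.
     * A length of 0 means "through the end of the file". Note that
     * posix_fadvise returns an error number instead of setting errno. */
    if (posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) != 0)
        rc = -1;

    return rc;
}
```

The tradeoff is that if you need the PNG bytes again, you pay for another trip to disk.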