| This is a pretty old argument and IMO it's far out of date/obsolete. Taking full control of your I/O and buffer management is great if (a) your developers are all smart and experienced enough to be kernel programmers and (b) your DBMS is the only process running on a machine. In practice, (a) is never true, and (b) is no longer true because everyone is running apps inside containers inside shared VMs. In the modern application/server environment, no user level process has accurate information about the total state of the machine, only the kernel (or hypervisor) does and it's an exercise in futility to try to manage paging etc at the user level. As Dr. Michael Stonebraker put it: The Traditional RDBMS Wisdom is (Almost Certainly) All Wrong. https://slideshot.epfl.ch/play/suri_stonebraker (See the slide at 21:25 into the video). Modern DBMSs spend 96% of their time managing buffers and locks, and only 4% doing actual useful work for the caller. Granted, even using mmap you still need to know wtf you're doing. MongoDB's original mmap backing store was a poster child for Doing It Wrong, getting all of the reliability problems and none of the performance benefits. LMDB is an example of doing it right: perfect crash-proof reliability, and perfect linear read scalability across arbitrarily many CPUs with zero-copy reads and no wasted effort, and a hot code path that fits into a CPU's 32KB L1 instruction cache. |
This is co-authored by Pavlo and Viktor Leis, with feedback from Neumann. I'm sorry, but if someone on the internet claims to know better than those three, you're going to need some monumental evidence of your credibility.
Additionally, what you link here is discussing "Main Memory" databases. These databases do no I/O beyond potential initial reads, because all of the data fits in memory! They represent a small portion of contemporary DBMS usage compared to traditional RDBMSs.
All you have to do is look at the bandwidth and reads/sec from the paper when using O_DIRECT "pread()"s versus mmap'ed IO.
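For anyone unsure what the two access paths being compared look like, here is a minimal sketch in Python. It is not the paper's benchmark: the page size, the helper names (`read_page_pread`, `read_page_mmap`, `demo`) and the temp file are all made up for illustration, and O_DIRECT is left out because it needs aligned buffers and is platform-specific. It only shows that one path is an explicit read into your own buffer while the other leaves the paging to the kernel.

```python
import mmap
import os
import tempfile

PAGE = 4096  # assumed page size, for illustration only


def read_page_pread(fd, page_no):
    # Explicit I/O: one syscall copies the requested page into a buffer we own.
    return os.pread(fd, PAGE, page_no * PAGE)


def read_page_mmap(mm, page_no):
    # Mapped I/O: the kernel faults the page in on first touch; the slice copies it out.
    return mm[page_no * PAGE:(page_no + 1) * PAGE]


def demo(pages=8):
    # Write a small file where page i is filled with byte value i,
    # then check that both paths return identical data for every page.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"".join(bytes([i]) * PAGE for i in range(pages)))
        f.flush()
        fd = os.open(f.name, os.O_RDONLY)
        try:
            mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
            try:
                return all(
                    read_page_pread(fd, n) == read_page_mmap(mm, n)
                    for n in range(pages)
                )
            finally:
                mm.close()
        finally:
            os.close(fd)
```

Both return the same bytes; the difference the paper measures is in who decides when pages get loaded and evicted, which this toy example does not attempt to reproduce.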