by apavlo 264 days ago
> • Memory-Mapping (mmap): We treat the database file as if it’s already in memory, eliminating the distinction between disk and RAM.

Ugh, not another one...

4 comments

Yep, another developer enthusiastically proposing mmap as an "easy win" for database design, when in reality it often causes hard-to-debug correctness and performance problems.
To be fair, I use it to share financial time series between multiple processes, and as long as there is a single writer it works well. It has been in production for several years.
Creating a shared memory buffer by mapping it as a file is not the same as mapping files on disk. The latter has weird and subtle problems, whereas the former just works.
To be clear, I am indeed doing mmap to the same file on disk. Not using shmap. But there is only one thread in one process writing to it, and the readers are tolerant of millisecond delays.
> millisecond delays

I thought you said financial time series!

But yeah, this is a case where mmap works great - convenience, not super fast, single writer and not necessarily super durable.

> I thought you said financial time series!

Yeah it is just your average normal financial time series.
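
For anyone wondering what the single-writer setup discussed above might look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, not the commenter's actual system: the file layout (an 8-byte record count followed by fixed-size records), the names `create_file`, `Writer`, `Reader`, and `CAPACITY`, and the choice to publish a record by bumping the count only after the record bytes are written. It also assumes readers can tolerate slightly stale data, and it makes no claim about memory-ordering guarantees across processes or about durability beyond what `flush()` requests.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout: one header followed by fixed-size records.
HEADER = struct.Struct("<Q")   # number of committed records
RECORD = struct.Struct("<qd")  # (timestamp_ns, price)
CAPACITY = 1024                # illustrative fixed capacity


def create_file(path):
    """Pre-size the file so the mapping never has to grow."""
    size = HEADER.size + CAPACITY * RECORD.size
    with open(path, "wb") as f:
        f.truncate(size)  # zero-filled, so the record count starts at 0
    return size


class Writer:
    """The only writer: appends a record, then publishes it via the count."""

    def __init__(self, path):
        self._f = open(path, "r+b")
        self._m = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_WRITE)
        (self._count,) = HEADER.unpack_from(self._m, 0)

    def append(self, ts_ns, price):
        if self._count >= CAPACITY:
            raise RuntimeError("file is full")
        offset = HEADER.size + self._count * RECORD.size
        RECORD.pack_into(self._m, offset, ts_ns, price)  # write the data first
        self._count += 1
        HEADER.pack_into(self._m, 0, self._count)        # then publish it

    def flush(self):
        # Ask the OS to write dirty pages back to the file (msync on POSIX).
        self._m.flush()

    def close(self):
        self._m.close()
        self._f.close()


class Reader:
    """Read-only view of the same file; sees only committed records."""

    def __init__(self, path):
        self._f = open(path, "rb")
        self._m = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_READ)

    def read_all(self):
        (count,) = HEADER.unpack_from(self._m, 0)
        return [
            RECORD.unpack_from(self._m, HEADER.size + i * RECORD.size)
            for i in range(count)
        ]

    def close(self):
        self._m.close()
        self._f.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "ticks.bin")
    create_file(path)
    reader, writer = Reader(path), Writer(path)
    writer.append(1, 100.5)
    writer.append(2, 101.25)
    print(reader.read_all())
    writer.flush()
    writer.close()
    reader.close()
```

In real use the reader and writer would live in separate processes; both mappings are shared mappings of the same file, so the reader sees appended records without reopening it. The sketch leaves out the hard parts this thread is arguing about: I/O errors surfacing as signals, crash consistency, and control over eviction.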

Why not, though? From what I can see in the docs, these databases are supposed to be static and read-only, at least when you use them on a device.
Page cache reclamation is mostly single-threaded. It is much simpler than what you can build in user space; for example, it has no weighting for specific pages.

Traveling into the kernel flushes the branch predictor caches and the TLB, so it's not free at all.

No issue if you know what you are doing. Not sure about the author, but I know of very high-performance mmap systems that have run for decades without corruption or other issues (in HFT, finance, and payments).
Ctrl-F'd you here the moment I saw that in the article.