by jimis 2781 days ago
LMDB is a very good choice for many well-known reasons. I don't need to expand here, the advantages are well documented, and more and more projects are choosing LMDB.

However, LMDB does not solve all problems and can be a bad choice for some, and I couldn't find this documented anywhere. Specifically, it is a poor fit for write-intensive workloads. Why?

- LMDB by default provides full ACID semantics, which means that after every committed key-value write, it needs to sync to disk. Evidently, if this happens tens of times per second, your system performance will suffer.

- LMDB also provides a super-fast asynchronous mode (`MDB_NOSYNC`), and this is the one most often benchmarked. But a little known fact is that you lose all of ACID, meaning that a system crash can cause total loss of the database. Only use `MDB_NOSYNC` if your data is expendable.
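
To get a feel for the cost of syncing on every commit, here is a rough sketch that uses only the Python standard library. It does not touch LMDB at all: it appends records to a plain file and calls `os.fsync` either after every record or once per batch. The names `write_records` and `compare` are made up for illustration, and the timings will vary a lot with your disk and filesystem.

```python
import os
import tempfile
import time


def write_records(path, records, sync_every):
    """Append records to `path`, calling fsync after every `sync_every` records.

    Returns (elapsed_seconds, number_of_fsync_calls).
    """
    syncs = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        for i, rec in enumerate(records, start=1):
            f.write(rec)
            if i % sync_every == 0:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to push it to the device
                syncs += 1
        # Final sync so both variants end in the same on-disk state.
        f.flush()
        os.fsync(f.fileno())
        syncs += 1
    return time.perf_counter() - start, syncs


def compare(n=200):
    """Time n writes with one fsync per write versus one fsync per batch."""
    records = [b"key%06d=value\n" % i for i in range(n)]
    with tempfile.TemporaryDirectory() as d:
        per_write = write_records(os.path.join(d, "sync.log"), records, sync_every=1)
        batched = write_records(os.path.join(d, "batch.log"), records, sync_every=n)
    return per_write, batched


if __name__ == "__main__":
    (t1, s1), (t2, s2) = compare()
    print(f"fsync per write: {s1} syncs, {t1:.4f}s")
    print(f"fsync per batch: {s2} syncs, {t2:.4f}s")
```

The gap between the two timings is the price of making each write durable on its own; a real database does more work per commit than this, so treat it as a lower bound on the effect rather than a benchmark.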

In short, I would advise against LMDB if you expect more than a couple of independent writes per second. In that case, consider a database that syncs to disk only occasionally, offering just ACI semantics (no Durability, which means that a system crash can lose only the last few seconds of data).
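
As a minimal sketch of what "syncs to disk only occasionally" means, the toy log below fsyncs at most once per interval and tracks how many records would be lost if the machine crashed right now. `PeriodicSyncLog`, `at_risk`, and `demo` are hypothetical names for illustration, not part of LMDB or any other library, and the sketch models only the durability window, not atomicity, consistency, or isolation.

```python
import os
import tempfile
import time


class PeriodicSyncLog:
    """Append-only log that fsyncs at most once per `interval` seconds."""

    def __init__(self, path, interval=1.0, clock=time.monotonic):
        self._f = open(path, "ab")
        self._interval = interval
        self._clock = clock          # injectable so the demo can fake time
        self._last_sync = clock()
        self.at_risk = 0             # records written since the last fsync

    def append(self, record):
        self._f.write(record)
        self.at_risk += 1
        # Sync only when the interval has elapsed since the previous sync.
        if self._clock() - self._last_sync >= self._interval:
            self.sync()

    def sync(self):
        self._f.flush()
        os.fsync(self._f.fileno())
        self._last_sync = self._clock()
        self.at_risk = 0

    def close(self):
        self.sync()
        self._f.close()


def demo():
    """Write 10 records 250 ms apart (simulated) and record the at-risk count."""
    now = [0.0]
    history = []
    with tempfile.TemporaryDirectory() as d:
        log = PeriodicSyncLog(os.path.join(d, "wal.log"),
                              interval=1.0, clock=lambda: now[0])
        for i in range(10):
            now[0] += 0.25           # pretend 250 ms pass between writes
            log.append(b"record %d\n" % i)
            history.append(log.at_risk)
        log.close()
    return history


if __name__ == "__main__":
    print(demo())
```

The at-risk count never grows beyond what fits into one interval, which is the trade being described: a crash costs you at most the last interval of writes, in exchange for far fewer syncs.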

3 comments

> But a little known fact is that you lose all of ACID, meaning that a system crash can cause total loss of the database. Only use `MDB_NOSYNC` if your data is expendable.

Last I looked into LMDB, this was only the case if the filesystem doesn't preserve write ordering. Otherwise you get everything but durability (i.e. ACI). If I recall, writes are ordered by default on ext3.

This is exactly our experience. Using the default settings, RocksDB massively outperforms LMDB on single-key write workloads, because it writes asynchronously.

Your advice made sense in the age of rotating-platter HDDs, limited to a max of ~120 IOPS. Today's world of NVMe SSDs makes your considerations obsolete.

There's an even older technology, battery-backed, RAM-cached HDDs, that gives you everything an SSD can, except the thing you aren't actually using here: fast random-access read performance.

That's not true. With SSDs we can sync to disk more often, but it's still very slow.