by jmalicki 7 days ago
Writing to disk on every write is required; otherwise you're not durable.

Sure, it's faster to never write to disk, but then you reboot and you've lost data.

/dev/null is a webscale database that is even faster!


There are a lot of use cases where you only truly need consistency, and durability can take a back seat. RocksDB for example does not fsync its WAL writes in the default configuration.

https://github.com/facebook/rocksdb/wiki/WAL-Performance#non...

If you can't at least guarantee write ordering you don't even have consistency.

Fsync is often used when the data doesn't truly need to be on disk, because there aren't very good write ordering APIs exposed, even if that's all you truly need.
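
To make the ordering point concrete, here is a minimal sketch of fsync used purely as a write barrier. The function names and the `COMMIT` marker are hypothetical, and it assumes the OS and the drive actually honor the flush. The first fsync is there only so the marker can never reach disk ahead of the payload.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write every byte of data, retrying after short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_commit(path: str, payload: bytes, marker: bytes = b"COMMIT\n") -> None:
    """Append payload, then a commit marker, using fsync as the ordering barrier."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        write_all(fd, payload)
        # Barrier: the payload is flushed before the marker is even written,
        # so a crash cannot leave a marker on disk without its payload.
        os.fsync(fd)
        write_all(fd, marker)
        # Second flush makes the marker itself durable before returning.
        os.fsync(fd)
    finally:
        os.close(fd)
```

If ordering were all you needed, the second fsync could be deferred, but there is no portable call that expresses "order these writes" without also waiting for the flush.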

Read the whole article. WAL is the transaction log and the author tested correctness after a crash.

Well, the thing about reliability is that you can't really guarantee it by testing one particular scenario.

It seems to me that neither the old nor the new version of the code is really "durable" as I would understand the word. The old version made a write syscall per batch, but the article doesn't say it also did an fsync per batch. The new version writes data to an mmap'ed file, and calls fsync in the background.

So both versions are "durable" in the sense that written data is preserved even if the process gets killed, because it's in the OS page cache. But in both versions, a write can be completed before the data actually makes it to disk, so a power failure will lose acknowledged writes.
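
A rough sketch of the distinction, using a hypothetical `WriteAheadLog` class (not the article's code). With `sync_on_append=False`, `append` returns once the bytes are in the page cache, which survives a killed process but not a power failure. With `sync_on_append=True`, it returns only after fsync, assuming the device honors the flush.

```python
import os


class WriteAheadLog:
    """Hypothetical append-only log, for illustration only."""

    def __init__(self, path: str, sync_on_append: bool) -> None:
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._sync_on_append = sync_on_append

    def append(self, record: bytes) -> None:
        """Return only once the record may be acknowledged under this policy."""
        view = memoryview(record)
        while view:
            written = os.write(self._fd, view)  # lands in the OS page cache
            view = view[written:]
        if self._sync_on_append:
            os.fsync(self._fd)  # wait until the kernel reports the data flushed

    def sync(self) -> None:
        """Explicit flush, e.g. called from a background timer."""
        os.fsync(self._fd)

    def close(self) -> None:
        os.close(self._fd)
```

Both modes produce the same file contents under normal operation and after a SIGKILL, which is why a kill test alone cannot tell them apart.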

They tested SIGKILLing the process; they didn't test a power loss situation.

"Every batch of writes called file.Write on the write-ahead log"

You don't write to the WAL on a batch.

> the author tested correctness after a crash.

You mean the LLM?