Hacker News new | ask | show | jobs
by krenoten 2951 days ago
ALICE showed that's not always true with sqlite. Sled is being built with an extreme bias toward reliability over features, but as the readme says, it has some time to go before reaching maturity. The tests are quite good at finding new issues and deterministically replaying them, so you can help bake it in by mining bugs using the default test suite and help it get there.
1 comments

Most database designers assume that a power failure will only affect writes that are pending. Alas, for SSDs and NVMEs that's not always true. A power failure can cause all kinds of corruption. Long story short: even append-only strategies will not save you.

https://www.usenix.org/system/files/conference/fast13/fast13...

Indeed. This is why I aggressively checksum everything and pay particular attention to throwing away all data that was written after any detected corruption during recovery. This is easier with the log-only architecture. It's also totally os and filesystem agnostic. I was happily surprised yesterday when it passed tests on fuchsia :]
So you might end up throwing away everything. (I know, it's not your fault)
Users can rely on sequential recovery. At some point I'll probably write a partial recovery tool that gives you all versions of all keys that are at all present anywhere in the readable file though, which won't be much work. Typical best practices encourage moving away from single disk reliance for particularly valuable data, but this library will also work on phones etc... So it is important to support people when a wide variety of things go wrong.
That paper sounds like problems that can not be worked around in software and need hardware fixes?
You could work around it (erasure coding, fe), if you had insights on the failure mode specifics, but it's vendor specific and vendors are not exactly forthcoming.

So the only thing you can do is add checksumming schemes that allow you to detect you have been affected.

The paper is from 2013, so the situation might have improved meanwhile (I wouldn't put any money on it)