Hacker News new | ask | show | jobs
by ffsm8 638 days ago
I tried it out on my homelab server right after the merge into the Linux kernel.

Took roughly one week for the whole raid to stop mounting because of the journal (8hdd, 2 ssd write cache, 2 nvme read cache).

The author responded on Reddit within a day, I tried his fix, (which meant compiling the Linux kernel and booting from that), but his fix didn't resolve the issue. He sadly didn't respond after that, so I wiped and switched back to a plain mdadmin raid after a few days of waiting.

I had everything important backed up, obviously (though I did lose some unimportant data), but it did remind me that bleeding edge is indeed ... Unstable

The setup process and features are fantastic however, simply being able to add a disk and flag it as read/write cache feels great. I'm certain I'll give it another try in a few years, after it had some time in the oven.

1 comments

New filesystems seems to have a chicken and egg problem really. It's not like switching from Nvidia's proprietary drivers to nouveau and then back if it turns out they don't work that well. Switching filesystems, especially in larger raid setups where you desperately need more testing and real world usage feedback, is pretty involved, and even if you have everything backed up it's pretty time consuming restoring everything should things go haywire.

And even if you have the time and patience to be one of these early adopters, debugging any issues encountered might also be difficult, as ideally you want to give the devs full access to your filesystem for debugging and attempted fixes, which is obviously not always feasible.

So anything beyond the most trivial setups and usage patterns gets a miniscule amount of testing.

In an ideal world, you'd nail your FS design first try, make no mistakes during implementation and call it a day. I'd like to live in an ideal world.

> In an ideal world, you'd nail your FS design first try, make no mistakes during implementation and call it a day

Crypto implementations and FS implementations strike me as the ideal audience for actually investing the mental energy in the healthy ecosystem we have of modeling and correctness verification systems

Now, I readily admit that I could be talking out of my ass, given that I've not tried to use those verification systems in anger, as I am not in the crypto (or FS) authoring space but AWS uses formal verification for their ... fork? ... of BoringSSL et al https://github.com/awslabs/aws-lc-verification#aws-libcrypto...

A major chunk of storage reliability is all these weird and unexpected failure modes and edge cases which are not possible to prepare for, let alone write fixed specs for. Software correctness assumes the underlying system behaves correctly and stays fixed, which is not the case. You can't trust the hardware and the systems are too diverse - this is the worst case for formal verification.