by colechristensen 1832 days ago
ZFS likes RAM and uses it to get better performance (and don't think about using dedup without a huge amount of RAM), but you don't strictly need lots of it, and you can change the defaults.
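As a rough sketch of "changing the defaults": on Linux OpenZFS the ARC ceiling is the `zfs_arc_max` module parameter, in bytes. The helper name `arc_max_option` below is made up for illustration, and FreeBSD-based FreeNAS uses a sysctl tunable instead (`vfs.zfs.arc_max` on the versions I know of), so treat this as an assumption to check against your own platform.

```python
# Hypothetical helper: build a modprobe.d line that caps the ZFS ARC.
# Assumes Linux OpenZFS, where zfs_arc_max is given in bytes.
GIB = 1024 ** 3


def arc_max_option(limit_gib: float) -> str:
    """Return an 'options zfs ...' line capping the ARC at limit_gib GiB."""
    if limit_gib <= 0:
        raise ValueError("limit must be positive")
    return f"options zfs zfs_arc_max={int(limit_gib * GIB)}"


if __name__ == "__main__":
    # Line you might place in /etc/modprobe.d/zfs.conf for a 4 GiB cap.
    print(arc_max_option(4))
```

A smaller ARC mostly costs you cache hits, so on a small home box this is a trade of performance for memory headroom.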

ECC tends to attract zealots who are after a perfect, error-free existence. ECC moves you toward that but doesn't deliver it; it just reduces errors. I personally don't care about a tiny amount of bit rot (ZFS will prevent most of this) or about rebooting my storage machine now and then.

You can run ZFS/FreeNAS on a crappy old machine and you'll be just fine, as long as you aren't hosting storage for dozens of people and you aren't a digital archivist trying to keep everything for centuries.

Real advice:

* Mirrored vdevs perform way better than raidz. I don't think the raidz storage gain is worth it until you have dozens of drives

* Dedup isn't worth it

* Enable lz4 compression everywhere

* Have a hot spare

* You can increase performance by adding more vdevs and by adding RAM

* Use drives with the same capacity
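To put rough numbers on the mirror vs. raidz storage trade-off from the first bullet, here is a back-of-the-envelope sketch. It assumes equal-size drives, two-way mirrors, and a single raidz2 vdev, and it ignores metadata, padding, and reserved space, so real pools will come out somewhat lower. The function names are just for illustration.

```python
def mirror_usable_tb(drives: int, size_tb: float, width: int = 2) -> float:
    """Usable TB for a pool built from `width`-way mirror vdevs."""
    return (drives // width) * size_tb


def raidz_usable_tb(drives: int, size_tb: float, parity: int = 2) -> float:
    """Usable TB for one raidz vdev with `parity` parity drives."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * size_tb


if __name__ == "__main__":
    for n in (4, 6, 8, 12):
        m = mirror_usable_tb(n, 8)
        z = raidz_usable_tb(n, 8)
        print(f"{n} x 8 TB: mirrors {m:.0f} TB, raidz2 {z:.0f} TB")
```

With four drives the two layouts come out even, and the raidz2 advantage only grows as the drive count does, which is the point about not bothering until the pool is large.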

3 comments

> Dedup isn't worth it

To add to that, ZFS dedup is a lie and you should forget its existence unless you have a very specific scenario of being a SAN with a massive amount of RAM, and even then, you had better be damn sure.

I really wish ZFS had either an option to store the dedup table on an NVMe device like Optane, or a way to run an offline deduplication job.

It does have the former, these days - the "allocation_classes" feature lets you make the permanent home of certain subsets of data on "special" vdevs - which includes methods of specifying "store dedup table there".

Now, that becomes the only place entries on it are stored, so you best make it redundant if you don't want to lose your pool from a single NVMe failing, but the feature is there.

The latter I would predict seeing approximately when the sun burns out, on ZFS. It _really_ doesn't like the idea of data changing locations retroactively.

Thanks for this. I completely missed this feature in the run up to 0.8.

I'm going to have to do some test setups with this.
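For a sense of why dedup wants "a massive amount of RAM", here is a rough estimate of the dedup table size. The 320 bytes per entry is the commonly quoted rule of thumb, not a measured figure, and the average block size is a guess you would need to replace with your own pool's numbers; both are assumptions, as is the function name.

```python
def ddt_size_gib(unique_data_tib: float,
                 avg_block_kib: float = 128,
                 bytes_per_entry: int = 320) -> float:
    """Estimate dedup table size in GiB.

    One table entry per unique block; bytes_per_entry is an assumed
    rule-of-thumb value, not something read from a real pool.
    """
    unique_blocks = unique_data_tib * 1024 ** 4 / (avg_block_kib * 1024)
    return unique_blocks * bytes_per_entry / 1024 ** 3


if __name__ == "__main__":
    for block_kib in (128, 64, 8):
        gib = ddt_size_gib(10, avg_block_kib=block_kib)
        print(f"10 TiB unique data, {block_kib} KiB blocks: ~{gib:.0f} GiB of DDT")
```

Smaller blocks blow the estimate up fast, which is why small-record workloads are where dedup hurts most.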

> Enable lz4 compression everywhere

Is the perf penalty low enough now that it just doesn't matter? I've always disabled compression on datasets I know are going to store only high-entropy data, like encoded video, that has a poor compression ratio.

I second the hot spare recommendation many times over. It can save your bacon.

It's generally the other way around, actually, aside from datasets that are already highly compressed (e.g. video). lz4 compression usually gets you better effective performance on ZFS, in both throughput and latency, because less I/O has to be done. Your CPU can usually run lz4 at hundreds of MB/s to several GB/s per core, compared to the 100-200 MB/s or so you might get from spinning rust.

Neat! Makes sense.
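The lz4 argument above can be sketched as a simple bottleneck model: the disk delivers compressed bytes, each of which expands by the compression ratio, and the result is capped by how fast the CPU can decompress. The numbers below are illustrative assumptions, not benchmarks, and the model assumes disk reads and decompression overlap perfectly.

```python
def effective_read_mbps(disk_mbps: float, ratio: float, lz4_mbps: float) -> float:
    """Logical MB/s seen by the application under a simple bottleneck model.

    disk_mbps: raw sequential disk throughput (compressed bytes)
    ratio:     compression ratio (logical size / on-disk size), >= 1.0
    lz4_mbps:  decompression throughput of the CPU (logical bytes)
    """
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1.0")
    return min(disk_mbps * ratio, lz4_mbps)


if __name__ == "__main__":
    # Assumed figures: one spinning disk and a CPU that is much faster than it.
    for ratio in (1.0, 1.5, 2.0):
        print(f"ratio {ratio}: {effective_read_mbps(150, ratio, 2000):.0f} MB/s")
```

At a ratio of 1.0 (already-compressed video, say) you get the raw disk speed and no gain, which matches the exception called out above.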
Does rebooting help with soft errors in non-ECC RAM? I would have thought bit flips would be transient in nature, but I'm not really familiar.

I've been running ZFS (FreeNAS/TrueNAS) on 2 homemade NAS devices for years and years, and I can say it is rock solid, even though I never used ECC RAM due to lack of choices. I'd bet there were many soft errors in all these years, but so far I've never had a problem that could not be recovered. The biggest issue ever was the USB boot storage getting destroyed within months. I partially solved that by moving to fixed drives as the boot drive, and later I moved to virtualization for the boot disk and OS, so the problem went away completely.

Occasionally a bit flip will corrupt the state of something important and long-running; a reboot will obviously clear this.

Usually it will hit nothing and have no side effects.