by handrous 1832 days ago
The "you need at least 32GB of memory and it has to be ECC, or don't even bother trying to use ZFS" crowd has done some serious harm to ZFS adoption. Sure, that's what you need if you want excellent data integrity guarantees and to use all of ZFS' advanced features. If you're fine with merely way-better-than-most-other-filesystems data integrity guarantees and using only most of ZFS' advanced features, you don't need those.

I really don't know where the "You gotta have ECC RAM!" thing started. I've been running a ZFS RAID on Nvidia Jetson Nanos for years now and haven't had any issues at all with data integrity.

I don't see why ZFS would be more prone to data integrity issues spawning from a lack of ECC than any other filesystem.

Relevant quote from one of ZFS's primary designers, Matt Ahrens: “There's nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem. ... I would simply say: if you love your data, use ECC RAM. Additionally, use a filesystem that checksums your data, such as ZFS.”
Yeah, I remember reading that a few years ago.

If I were running a server farm or something, then yeah, I'd probably use ECC memory, but I think if you're running a home server, then the argument that ZFS necessitates ECC more than Ext4 or Btrfs or XFS or whatever doesn't really seem to be accurate.

> the argument that ZFS necessitates ECC more than Ext4 or Btrfs or XFS or whatever doesn't really seem to be accurate

Agreed.

> If I were running a server farm or something, then yeah, I'd probably use ECC memory, but I think if you're running a home server

Then you should still use ECC RAM, regardless of what filesystem you're using.

No, really. ECC matters (https://news.ycombinator.com/item?id=25622322) generally.

Fair enough, though AFAIK none of the SBC systems out there have ECC, and I generally use SBCs due to the low power consumption.
Years ago I saw it at:

https://www.truenas.com/community/threads/ecc-vs-non-ecc-ram...

(The gist of the scary story is that faulty RAM while scrubbing might kill "everything".) However, in the end ECC appears to NOT be so important, e.g., see

https://news.ycombinator.com/item?id=23687895

There is literally only one feature that uses massive amounts of memory: online deduplication, which relies on keeping an in-RAM table of duplicated blocks. This means that the more duplication you have, the larger the table is.

FreeBSD Mastery: ZFS by Michael Lucas around pg 174

Deduplication Memory Needs
==========================

"For a rough-and-dirty approximation, you can assume that 1 TB of deduplicated data uses about 5 GB of RAM. You can more closely approximate memory needs for your particular data by looking at your data pool and doing some math. We recommend always doing the math and computing how much RAM your data needs, then using the most pessimistic result. If the math gives you a number above 5 GB, use your math. If not, assume 5 GB per terabyte."

https://www.tiltedwindmillpress.com/?product=fmzfs

This is not to say you need 5GB for every 1TB of data. It doesn't even mean you need 5GB of RAM for every 1TB for which you have enabled dedup. It means you need approximately 5GB of RAM for each TB of data which is both duplicated and residing on a dataset for which you have enabled dedup. Because the high memory cost of dedup rises exactly in proportion to its utility, it's only useful in cases where you can plan ahead for its requirements. 99% of users are unlikely to use dedup; however, this doesn't stop some (not you, obviously) from promoting the idea that ZFS requires 5GB of memory per TB or some absurd figure.
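
For anyone who wants to plug in their own numbers, here is a rough sketch of the rule of thumb from the book quote: about 5 GB of RAM per TB of deduplicated data, or your own computed figure if it comes out higher. The function name and parameters are made up for illustration, and the 5 GB/TB constant is only the book's approximation, not a measurement of any real pool.

```python
# Rule of thumb from the quoted book passage: roughly 5 GB of RAM
# per TB of deduplicated data. This is an approximation, not a spec.
RULE_OF_THUMB_GB_PER_TB = 5.0


def estimate_dedup_ram_gb(dedup_data_tb, computed_gb=None):
    """Return a pessimistic RAM estimate (in GB) for the dedup table.

    dedup_data_tb: TB of data that is duplicated AND lives on a
                   dataset with dedup enabled (hypothetical input).
    computed_gb:   optional figure from your own pool-specific math.
    """
    if dedup_data_tb < 0:
        raise ValueError("dedup_data_tb must be non-negative")
    rule_of_thumb_gb = dedup_data_tb * RULE_OF_THUMB_GB_PER_TB
    if computed_gb is None:
        return rule_of_thumb_gb
    # Per the quote: use whichever result is more pessimistic.
    return max(rule_of_thumb_gb, computed_gb)


if __name__ == "__main__":
    # 2 TB of deduplicated data, no pool-specific math available.
    print(estimate_dedup_ram_gb(2.0))
    # Same data, but your own math says 14 GB: the larger figure wins.
    print(estimate_dedup_ram_gb(2.0, computed_gb=14.0))
```

Note that the input is only the data actually being deduplicated, not the total pool size, so a pool with dedup disabled everywhere gets an estimate of zero from this sketch.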

As an aside, I really liked the book. I found it easy to read and understand, and very informative. Despite being focused on FreeBSD, it's mostly applicable to Linux as well.