by imiric 1832 days ago
Simplicity. There's a lot of complexity in ZFS I'd rather not depend on, and because it does so many things it's a big investment and liability to switch to.

While I understand why it would be useful in a corporate setting, for personal use I've found the combination of LUKS+LVM+SnapRAID to work well and don't see the benefit of switching to ZFS. Two of those are core Linux features, and SnapRAID has been rock solid, though thankfully I haven't tested its recovery process, but it seems straightforward from the documentation. Sure I don't have the real-time error correction of ZFS and other fancy features, but most of those aren't requirements for a personal NAS.
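
For readers unfamiliar with how parity-based recovery works in principle, here is a minimal Python sketch of single-parity XOR recovery. It is a toy illustration of the general idea only, not SnapRAID's actual implementation or on-disk format; the function names and the equal-length block requirement are assumptions made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block for a set of data blocks (one block per disk)."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    disks = [b"AAAA", b"BBBB", b"CCCC"]  # hypothetical per-disk blocks
    parity = compute_parity(disks)
    # Pretend the second disk died; rebuild it from the other two plus parity.
    rebuilt = recover_block([disks[0], disks[2]], parity)
    print(rebuilt == disks[1])  # True
```

With a single parity block, only one lost block per stripe can be rebuilt, which is one reason backups still matter.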


What about if you were just starting today, with 0 knowledge about basically anything related to storage and how to do it right?

That's my case, I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option that I have given today's ecosystem.

I was in the same place 6 or 7 years ago. Due to indecision, I ended up using btrfs, zfs, and mdadm (technically, Synology hybrid raid) on various devices. They all work, more or less.

Looking back, the lessons that come to mind are:

- Always have 2 backups (not counting the primary copy), at least 1 "cold" (inaccessible without human intervention) and at least 1 offsite. Back up frequently and retain old backups. With backups, bad decisions are reversible.

- With btrfs or zfs, using a collection of 2-disk mirrors was useful because it provided flexibility (to expand the array, just add another pair of disks) and seemed to have better performance than a single disk. Try to pair disks from different manufacturing batches though. I saw two disks from the same batch and _used in the same mirror_ fail in the same month, which was disconcerting.

- The only data corruption I had to deal with was from RAM that started off good and went bad after a couple years.

- Standardizing on btrfs or zfs from the beginning would have allowed backup by sending snapshots, which would have been a lot easier than cobbling together a solution using rsync.

- Scrub on a regular schedule. Set up monitoring software to notify you of the outcome of each scrub and of any SMART errors.
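
The scrub lesson rests on one idea: record a checksum while the data is known to be good, then re-read and compare on a schedule. For filesystems without built-in checksumming, a rough file-level approximation might look like the Python sketch below. It is an illustration only, not how btrfs or ZFS scrub internally (they checksum blocks and can repair from redundancy); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def scrub(root, manifest):
    """Return (changed, missing) relative paths compared to the manifest."""
    current = build_manifest(root)
    changed = sorted(k for k, v in manifest.items() if k in current and current[k] != v)
    missing = sorted(k for k in manifest if k not in current)
    return changed, missing
```

A mismatch only says that a file changed since the manifest was built. It cannot tell bitrot from a legitimate edit, and it cannot repair anything, which is where the backups come in.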

Thank you. I need to start small, otherwise I feel overwhelmed by too many moving pieces to keep in mind and plan for.

So I'm starting small, by powering up a ThinkCentre M910 I had lying around, with an internal disk that can be used to store backups. I have 0 need for performance, so my idea was to extend storage with an external USB3 HD enclosure. For now, I have neither the space nor a machine in which to install two hard disks for building a decent RAID. Time will tell.

> That's my case, I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option that I have given today's ecosystem.

ZFS is the simplest stack that you can learn IMHO. But if you want to learn all the moving parts of an operating system for (e.g.) professional development, then more complex may be more useful.

If you want to create a mirrored pair of disks in ZFS, you do: sudo zpool create mydata mirror /dev/sda /dev/sdb

In the old school fashion, you first partition with gdisk, then you use mdadm to create the mirroring, then (optionally) LVM to create volume management, then mkfs.
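
To make the difference in step count concrete, the Python sketch below builds (but does not run) the commands for both approaches. The ZFS command is the one quoted above. The layered sequence is an assumption about what a typical setup might look like: `sgdisk` stands in for interactive `gdisk`, the names `md0`, `vg0`, and `data` and the choice of ext4 are hypothetical, and the `{disk}1` partition naming only holds for `/dev/sdX`-style devices. All of these commands destroy existing data on the named disks, so treat this as illustration only.

```python
import shlex


def zfs_mirror_commands(pool, disks):
    """Single command to create a mirrored ZFS pool."""
    return [["zpool", "create", pool, "mirror", *disks]]


def layered_mirror_commands(disks, md="/dev/md0", vg="vg0", lv="data"):
    """Partition, mirror with mdadm, add LVM, then make a filesystem."""
    cmds = []
    parts = []
    for disk in disks:
        # One partition spanning the disk, typed as Linux RAID (fd00).
        cmds.append(["sgdisk", "-n", "1:0:0", "-t", "1:fd00", disk])
        parts.append(f"{disk}1")
    cmds.append(["mdadm", "--create", md, "--level=1",
                 f"--raid-devices={len(parts)}", *parts])
    cmds.append(["pvcreate", md])
    cmds.append(["vgcreate", vg, md])
    cmds.append(["lvcreate", "-n", lv, "-l", "100%FREE", vg])
    cmds.append(["mkfs.ext4", f"/dev/{vg}/{lv}"])
    return cmds


if __name__ == "__main__":
    disks = ["/dev/sda", "/dev/sdb"]
    for name, cmds in [("ZFS", zfs_mirror_commands("mydata", disks)),
                       ("mdadm + LVM", layered_mirror_commands(disks))]:
        print(f"{name}: {len(cmds)} command(s)")
        for cmd in cmds:
            print("  " + shlex.join(cmd))
```

Counting commands is a crude measure of complexity, but it does show how many separate tools and concepts the layered route asks a newcomer to learn.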

I dove into ZFS for my home lab as a relative novice.

It's not terrible, but there are a few new concepts to come to grips with. Once you have them down, it's manageable.

If you don't plan on raiding, IMO, ZFS is overkill. The check-summing is nice, but you can get that from other filesystems.

Maintenance is fairly straightforward. I've even done a disk swap without too much fuss.

The biggest issue I had was that setting up RAID-Z on root with Ubuntu was a PITA (at the time at least, March of this year). I ended up switching over to Debian instead. Once set up, things have been pretty smooth.

Three things I like about it, as per what I've read so far:

* Checksumming

* As you mention, easy maintenance

* Snapshots and how useful they are for backups

In the end what I value is stuff that works reliably, doesn't get in the way, and requires minimal supervision. And in the particular case of FS, I'd like to adopt a system that helps avoid bitrot in my data.

Could you drop some names that you would consider good alternatives to ZFS?

For close to ZFS feature parity but much younger, BTRFS.

Otherwise it's sort of figuring out what features you want to drop. XFS and ext4 are probably where I'd look for a single disk hard drive.

Like I said, you could do ZFS, but definitely feels a bit like overkill. Setting up a vdev with one disk just to get snapshots and checksums seems like a lot.

I would still go with a collection of composable tools rather than something as monolithic as ZFS, and to avoid the learning curve. But again, for personal use. If you're planning to use ZFS in a professional setting it might be good to experiment with it at home.

As mentioned in the sibling comment, one thing I like is having systems that don't require me to supervise, fix things, etc. In part that's why I've always been a user of ext4, it just works.

But I've recently found bitrot in some of my data files, and now that I happen to be learning about how to build a NAS, I want to make the jump to some FS that helps me with that task.

Could you mention which tools you would use to replace ZFS? Think of checksumming, snapshotting, and to a lesser degree, replication/RAID.

I would argue that a collection of mostly composable tools can easily be much more complex (and bug-prone!) than a single “monolith”. Fewer moving parts can be good sometimes, and I would argue that file system/volume management is a very compact problem domain where better integration between the tools is more important than extensibility.

> LUKS+LVM+SnapRAID

+ your fs

Yeah that sounds like a lot less complexity

ZFS has all of these features and more. If I don't need those extra features, a setup without them is by definition a less complex system.

Using composable tools is also better from a maintenance standpoint. If tomorrow SnapRAID stops working, I can replace just that component with something else without affecting the rest of the system.

> If tomorrow SnapRAID stops working, I can replace just that component with something else without affecting the rest of the system.

Can you actually? If some layer of that storage stack stops working then you can no longer access your existing data, because all these layers need to work correctly to correctly reassemble the data read from disk.

It's a hypothetical scenario :) In reality if there's a project shutdown there would be enough time to migrate to a different setup. Of course it would be annoying to do, but at least it's possible. With a system like ZFS I'm risking having to change the filesystem, volume manager, storage array, encryption and whatever other feature I depended on. It's a lot to buy into.

Since all those tools are from different devs the system gets more complex. But hey, if you really think that ZFS is too complex to hold 55 petabytes because it has too many potential bugs, you should tell them:

https://computing.llnl.gov/projects/zfs-lustre

Thankfully I don't have to manage 55 petabytes of data, but good luck to them.

Did you miss the part where I mentioned "for personal use"?

> Since all those tools are from different devs the system gets more complex.

I fail to see the connection there. Whether software is developed by a single entity or multiple developers has no relation to how complex the end user system will be.

But many small tools focused on just the functionality I need allow me to build a simpler system overall.

>Did you miss the part where I mentioned "for personal use"?

Since ZFS is simpler to use than your setup, and has been used to store 55PB of data without a single bit error since 2012, I don't see why someone should use inferior stuff, even when it's "personal use".

>But many small tools focused on just the functionality I need allows me to build a simpler system overall.

Sometimes monoliths are better, for example the network stack and storage... maybe kernels (big maybe here).

> Whether software is developed by a single entity or multiple developers has no relation to how complex the end user system will be.

The first part of this sentence is probably true, as far as I see, but the complexity of a system perceived by the user depends primarily on the "surface" of the system. That surface includes the UI, the documentation and important concepts you have to understand for effective usage of the system. And in that regard, ZFS wins hands down against LUKS + LVM + SnapRAID + your FS of choice. Some questions a user of that LVM stack has to answer aren't even asked of a ZFS user, e.g. the question of how to split the space between volumes or how to change the size of volumes.
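
The volume-sizing question can be illustrated with a toy model. In the Python sketch below, fixed-size volumes (as with LVM logical volumes that are never resized) can each fill up independently, while datasets drawing on one shared pool only fail when the whole pool is full. The class names and sizes are hypothetical, and the model deliberately ignores quotas, reservations, and thin provisioning.

```python
class FixedVolumes:
    """Each volume gets a fixed size decided up front."""

    def __init__(self, sizes):
        self.free = dict(sizes)

    def write(self, volume, amount):
        if amount > self.free[volume]:
            return False  # this volume is full, even if others have room
        self.free[volume] -= amount
        return True


class SharedPool:
    """All datasets draw from one pool of free space."""

    def __init__(self, total, datasets):
        self.free = total
        self.used = {name: 0 for name in datasets}

    def write(self, dataset, amount):
        if amount > self.free:
            return False  # only fails when the whole pool is exhausted
        self.free -= amount
        self.used[dataset] += amount
        return True


if __name__ == "__main__":
    fixed = FixedVolumes({"photos": 50, "backups": 50})
    pooled = SharedPool(100, ["photos", "backups"])
    print(fixed.write("photos", 70))   # False: the 50-unit volume is too small
    print(pooled.write("photos", 70))  # True: the pool still has room
```

In the fixed layout the user has to guess the split in advance and resize later if the guess was wrong; in the pooled layout that decision simply does not come up.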