by j1elo 1832 days ago
What about if you were just starting today, with 0 knowledge about basically anything related to storage and how to do it right?

That's my case, I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option that I have given today's ecosystem.

4 comments

I was in the same place 6 or 7 years ago. Due to indecision, I ended up using btrfs, zfs, and mdadm (technically, Synology hybrid raid) on various devices. They all work, more or less.

Looking back, the lessons that come to mind are:

- Always have 2 backups (not counting the primary copy), at least 1 "cold" (inaccessible without human intervention) and at least 1 offsite. Back up frequently and retain old backups. With backups, bad decisions are reversible.

- With btrfs or zfs, using a collection of 2-disk mirrors was useful because it provided flexibility (to expand the array, just add another pair of disks) and seemed to have better performance than a single disk. Try to pair disks from different manufacturing batches though. I saw two disks from the same batch and _used in the same mirror_ fail in the same month, which was disconcerting.

- The only data corruption I had to deal with was from RAM that started off good and went bad after a couple years.

- Standardizing on btrfs or zfs from the beginning would have allowed backup by sending snapshots, which would have been a lot easier than cobbling together a solution using rsync.

- Scrub on a regular schedule. Set up monitoring software to notify you of the outcome of each scrub and of any SMART errors.
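As a rough sketch of the snapshot-based backup and scheduled scrub from the list above, on the ZFS side: the script below only prints the commands, so it is safe to run anywhere. The dataset names `tank/data` and `backup/data` and the snapshot names are made up for illustration; wiring this into cron and mail notifications is left out.

```shell
#!/bin/sh
# Prints a backup-and-scrub plan; nothing here touches a real pool.
# "tank/data" and "backup/data" are hypothetical dataset names.
SRC=${SRC:-tank/data}
DST=${DST:-backup/data}

backup_plan() {
    prev=$1   # last snapshot that already exists on the backup side
    new=$2    # name for the snapshot to take now
    echo "zfs snapshot $SRC@$new"
    # Incremental send: only the blocks changed between prev and new.
    echo "zfs send -i $SRC@$prev $SRC@$new | zfs receive $DST"
    # Scrub the source pool, then report only if something is unhealthy.
    echo "zpool scrub ${SRC%%/*}"
    echo "zpool status -x ${SRC%%/*}"
}

# Hypothetical snapshot names, one week apart.
backup_plan 2024-01-01 2024-01-08
```

The btrfs equivalent would use `btrfs subvolume snapshot -r`, `btrfs send`/`btrfs receive` and `btrfs scrub start`.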

Thank you. I need to start small, otherwise I feel overwhelmed by too many moving pieces to keep in mind and plan for.

So I'm starting small, by powering up a ThinkCentre M910 I had lying around, with an internal disk that can be used to store backups. I have 0 need for performance, so my idea was to extend storage with an external USB3 HD enclosure. For now, I have neither the space nor a machine in which to install two hard disks for building a decent RAID. Time will tell.

> That's my case, I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option that I have given today's ecosystem.

ZFS is the simplest stack that you can learn IMHO. But if you want to learn all the moving parts of an operating system for (e.g.) professional development, then more complex may be more useful.

If you want to create a mirrored pair of disks in ZFS, you do: sudo zpool create mydata mirror /dev/sda /dev/sdb

In the old school fashion, you first partition with gdisk, then you use mdadm to create the mirroring, then (optionally) LVM to create volume management, then mkfs.
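To make the comparison concrete, here is a dry-run sketch of both routes. It prints the commands rather than running them unless `DRY_RUN=0` is set. Assumptions on my part: `sgdisk` (the scriptable sibling of gdisk) for partitioning, ext4 as the filesystem, and the names `md0`, `vg0` and `data`, none of which come from the comment above. The real commands destroy whatever is on the disks.

```shell
#!/bin/sh
# Dry-run sketch: prints commands instead of executing them.
# Device names, md0, vg0 and "data" are placeholders.
DISK_A=${DISK_A:-/dev/sda}
DISK_B=${DISK_B:-/dev/sdb}

run() {
    # Only execute when DRY_RUN=0 is set explicitly; otherwise just print.
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

zfs_mirror() {
    # One command: pool, mirror and filesystem in a single step.
    run zpool create mydata mirror "$DISK_A" "$DISK_B"
}

classic_mirror() {
    for d in "$DISK_A" "$DISK_B"; do
        # One partition spanning the disk, typed as Linux RAID (fd00).
        run sgdisk -n 1:0:0 -t 1:fd00 "$d"
    done
    # Partition suffix "1" assumes sdX-style names (NVMe would be "p1").
    run mdadm --create /dev/md0 --level=1 --raid-devices=2 "${DISK_A}1" "${DISK_B}1"
    run pvcreate /dev/md0                 # LVM physical volume on the array
    run vgcreate vg0 /dev/md0             # volume group
    run lvcreate -n data -l 100%FREE vg0  # one logical volume using all space
    run mkfs.ext4 /dev/vg0/data           # finally, the filesystem
}

zfs_mirror
classic_mirror
```

That is one command against seven, which is roughly the point being made about the simpler stack.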

I dove into ZFS for my home lab as a relative novice.

It's not terrible, but there are a few new concepts to come to grips with. Once you have them down, it's not terrible.

If you don't plan on raiding, IMO, ZFS is overkill. The check-summing is nice, but you can get that from other filesystems.

Maintenance is fairly straightforward. I've even done a disk swap without too much fuss.

The biggest issue I had was that setting up raidz on root with Ubuntu was a PITA (at the time at least, March of this year). I ended up switching over to Debian instead. Once set up, things have been pretty smooth.

Three things I like about it, as per what I've read so far:

* Checksumming

* As you mention, easy maintenance

* Snapshots and how useful they are for backups

In the end what I value is stuff that works reliably, doesn't get in the way, and requires minimal supervision. And in the particular case of a FS, I'd like to adopt a system that helps avoid bitrot in my data.

Could you drop some names that you would consider good alternatives to ZFS?

For close to ZFS feature parity but much younger, BTRFS.

Otherwise it's sort of figuring out what features you want to drop. XFS and ext4 are probably where I'd look for a single disk hard drive.

Like I said, you could do ZFS, but it definitely feels a bit like overkill. Setting up a vdev with one disk just to get snapshots and checksums seems like a lot.

I would still go with a collection of composable tools rather than something as monolithic as ZFS, and to avoid the learning curve. But again, for personal use. If you're planning to use ZFS in a professional setting it might be good to experiment with it at home.

As mentioned in the sibling comment, one thing I like is having systems that don't require me to supervise, fix things, etc. In part that's why I've always been a user of ext4, it just works.

But I've recently found bitrot in some of my data files, and now that I happen to be learning how to build a NAS, I wanted to make the jump to some FS that helps me with that task.

Could you mention which tools you would use to replace ZFS? Think of checksumming, snapshotting, and to a lesser degree, replication/RAID.
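For context on the checksumming part, the most basic stand-in on ext4 is a checksum manifest. The sketch below is my own assumption of what that could look like with GNU coreutils `sha256sum`, not something anyone in the thread proposed. The function names and the `SHA256SUMS` file name are hypothetical. It only detects corruption; unlike a ZFS or btrfs mirror it cannot repair anything, and it cannot tell bitrot from a legitimate edit made after the manifest was written.

```shell
#!/bin/sh
# Sketch: detect silent corruption on a plain filesystem with a manifest.
# Function names and the SHA256SUMS file name are hypothetical.

make_manifest() {
    # Record a SHA-256 for every regular file under the directory $1.
    # The manifest itself is excluded so it does not checksum itself.
    ( cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )
}

verify_manifest() {
    # Re-hash every listed file; exits non-zero if any file differs
    # from the recorded checksum or has gone missing.
    ( cd "$1" && sha256sum -c SHA256SUMS )
}
```

Usage would be `make_manifest /some/dir` right after writing the data, then `verify_manifest /some/dir` from a periodic job, restoring from backup whenever it reports a failure.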

I would argue that a collection of mostly composable tools can easily be much more complex (and bug-prone!) than a single “monolith”. Fewer moving parts can be good sometimes, and I would argue that file system/volume management is a very compact problem domain where better integration between the tools is more important than extensibility.