by kalleboo 1871 days ago
Right, 2 drives. Not "a new drive". Now you're buying twice as many disks as you would with a Synology and wasting 50% of your capacity on parity. And you'd better have set up your initial array with 2-drive vdevs as well, or you're going to have a sub-optimal experience.

This is the attitude I see a lot in ZFS support forums. "I don't see the problem, just buy twice as many drives!"


> This is the attitude I see a lot in ZFS support forums. "I don't see the problem, just buy twice as many drives!"

This is incorrect on several levels.

You most certainly can create a vdev with a single drive in it and add it to the zfs pool. So go ahead, buy that single 10TB drive and add it to your pool.

That's not a wise thing to do though, so I don't understand why you'd want to. You'll have no redundancy at all, as soon as the drive dies everything is lost. Which pretty much completely defeats the point of having a NAS. So don't do that. But if you really want to, you can.

> I don't understand why you'd want to

I want to add a single drive since I can't afford more than a single drive. But I still want to keep the data security of one or more parity drives. Synology lets me do that. ZFS doesn't.

On a Synology NAS (which just uses Linux mdraid under the hood, so this part isn't exactly some proprietary magic), if you have an array with parity (the equivalent of raid-z/z2), you can add a drive, and it expands the array with that one drive, keeping the parity and recalculating it for the new configuration of drives.

So I can go from an array of 3 x 10 TB disks where one is parity (20 TB usable storage), and then just pop in one more disk and now I have an array with 4 x 10 TB disks (30 TB usable storage) with the same one-disk parity. I can lose any one disk, and lose no data.
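
To make the arithmetic above concrete, here is a minimal Python sketch. The function names are made up for illustration, and it assumes equal-sized disks and ignores filesystem overhead. The mirror case is included only for comparison with the "2-drive vdev" layout mentioned at the top of the thread.

```python
def parity_usable_tb(disks: int, disk_tb: float, parity_disks: int = 1) -> float:
    """Usable capacity of one parity array (RAID-5/6 or raid-z style).

    Hypothetical helper: assumes all disks are the same size.
    """
    if disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (disks - parity_disks) * disk_tb


def mirror_pairs_usable_tb(disks: int, disk_tb: float) -> float:
    """Usable capacity when disks are grouped into 2-way mirrors."""
    if disks % 2 != 0:
        raise ValueError("2-way mirrors need an even number of disks")
    return (disks // 2) * disk_tb


print(parity_usable_tb(3, 10))        # 3 x 10 TB, one disk of parity
print(parity_usable_tb(4, 10))        # same array after adding one disk
print(mirror_pairs_usable_tb(4, 10))  # 4 x 10 TB as two mirrored pairs
```

Under these assumptions the parity array goes from 20 TB to 30 TB by adding a single disk, while four disks arranged as mirrored pairs give 20 TB.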

ZFS can't do that, since it doesn't support modifying vdevs. So if I want to be able to add a single drive and expand my storage at any time while keeping the same level of redundancy, ZFS makes no sense.

Synology's configuration of mdraid+BTRFS makes way more sense than ZFS. Unfortunately they haven't contributed it to free software so nobody else can have it (specifically the part of passing through the parity data so that checksum errors in BTRFS can be fixed with mdraid knowledge). I would prefer to not have to rely on Synology's cost-cutting hardware and raft of probably not very secure software. But for the use case of me and the small businesses I support, ZFS has been a non-starter due to the costs.

> So I can go from an array of 3 x 10 TB disks where one is parity (20 TB usable storage), and then just pop in one more disk and now I have an array with 4 x 10 TB disks (30 TB usable storage) with the same one-disk parity. I can lose any one disk, and lose no data

Based on those numbers and https://www.synology.com/en-us/support/RAID_calculator I'm guessing you're using RAID-5?

RAID-5 is fragile. You can lose only one disk as you say, but the odds of a successful rebuild are not so great (assuming you have a NAS for data reliability in the first place).

https://www.digistor.com.au/the-latest/Whether-RAID-5-is-sti...

> expand my storage at any time while keeping the same level of redundancy

But you don't keep the same level of redundancy when adding a drive. The more drives you add in RAID-5, the lower your probability of a successful rebuild after the loss of one drive.
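
A rough sketch of the kind of model usually behind this claim, in Python. Everything here is an assumption for illustration: the unrecoverable read error (URE) rate of 1 per 10^14 bits is a commonly quoted spec-sheet figure rather than a measured one, errors are treated as independent, and a rebuild is assumed to fail on the first URE. Real drives and real RAID implementations may behave quite differently.

```python
import math


def p_rebuild_ok(total_disks: int, disk_tb: float, ure_per_bit: float = 1e-14) -> float:
    """Naive probability that a single-parity rebuild hits no read error.

    After one disk fails, every surviving disk must be read in full.
    ure_per_bit is an assumed spec-sheet rate, not a measurement.
    """
    surviving = total_disks - 1
    bits_to_read = surviving * disk_tb * 1e12 * 8  # decimal TB -> bits
    # (1 - rate) ** bits, computed in log space to avoid precision loss
    return math.exp(bits_to_read * math.log1p(-ure_per_bit))


for n in (3, 4, 5):
    print(n, "disks:", round(p_rebuild_ok(n, 10), 3))
```

In this model the probability only goes down as disks are added, because each extra disk is more data that has to be read back cleanly during the rebuild.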

It was just an example with easy to reason about numbers. You could do the same thing with 2-disk redundancy.

> https://www.digistor.com.au/the-latest/Whether-RAID-5-is-sti...

I've seen a lot of articles and blog posts like this, but their numbers never seem to make sense. It says that when reading through a 4-disk 8 TB array you only have a 15% chance of success. I have full-array BTRFS scrubbing scheduled monthly; according to this, my array should have reported errors many times a year...
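
A quick Python sketch of that sanity check. It takes the 15% figure as quoted above at face value and assumes each monthly scrub is an independent full read of an array comparable to the article's example; the function names are hypothetical.

```python
def expected_failed_passes(p_clean: float, passes: int) -> float:
    """Expected number of full-array reads that hit at least one error."""
    return passes * (1 - p_clean)


def p_all_clean(p_clean: float, passes: int) -> float:
    """Probability that every pass completes without any error."""
    return p_clean ** passes


p = 0.15     # per-pass success chance quoted from the article
scrubs = 12  # one scrub per month for a year
print(expected_failed_passes(p, scrubs))  # expected scrubs reporting errors
print(p_all_clean(p, scrubs))             # chance of a completely clean year
```

If the 15% figure held, roughly ten of twelve monthly scrubs would be expected to report errors, and a clean year would be vanishingly unlikely.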

And of course, no matter what, no form of RAID/ZFS is a backup.

But doesn’t that come back to their point? With syno you pop in a new disk and it rebuilds the array with the new disk and you have more space and the same redundancy? Raid 5/6 whichever

With btrfs, you can add one or however many new devices you want to a storage pool, then rebalance to ensure redundancy across the whole pool. If the device you add is already btrfs formatted, its contents get added to the storage pool, rather than requiring a reformat.

It really surprises me that zfs apparently cannot do this.

The main reason I use btrfs is the flexibility. Subvolumes instead of partitions, and easy expandability. Storage should be dynamic, not static.

> It really surprises me that zfs apparently cannot do this.

Likewise. I really want to like ZFS, but the 'buy twice the drives or risk your data' approach above really deters me as a home user.

ZFS has been working on developing raidz expansion for a while now at https://github.com/openzfs/zfs/pull/8853 but I feel that it's a one-man task with no support from the overall project due to that prevailing attitude.

BTRFS is becoming more appealing, even though it has rough edges around RAID write holes (which really aren't a big deal) and the reporting of free space. I can see my home storage array going to BTRFS in the near future.

> The main reason I use btrfs is the flexibility.

I agree, and as a small home user I really like being able to run RAID across different-sized disks, e.g. RAID 1 on three disks: 2TB+4TB+6TB. It also makes it possible to grow the storage over time by replacing failed drives with larger ones.
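
For the mixed-size case, a minimal Python sketch of the usual rule of thumb for a two-copy RAID 1 profile: every block needs its two copies on different disks, so usable space is capped both by half the total and by what the other disks can pair with the largest one. The function name is hypothetical, and this ignores metadata overhead and how btrfs actually allocates chunks.

```python
def raid1_usable_tb(disk_sizes_tb: list[float]) -> float:
    """Approximate usable space for two-copy RAID 1 on mixed-size disks."""
    if len(disk_sizes_tb) < 2:
        raise ValueError("RAID 1 needs at least two disks")
    total = sum(disk_sizes_tb)
    largest = max(disk_sizes_tb)
    # Two copies of everything, and the largest disk can only mirror
    # against the combined space of all the others.
    return min(total / 2, total - largest)


print(raid1_usable_tb([2, 4, 6]))  # the 2TB+4TB+6TB example
print(raid1_usable_tb([8, 4, 6]))  # after swapping the 2 TB disk for an 8 TB one
```

Under this approximation the 2TB+4TB+6TB set gives about 6 TB usable, and replacing the smallest disk with an 8 TB one raises that to about 9 TB.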

They have dRAID, but last I checked RAID 5/6 is basically asking for data loss with modern drive sizes.

> last I checked RAID 5/6 is basically asking for data loss with modern drive sizes

This is a debate I would love to see between people who have experience, since I've seen individuals speak with authority on both sides.

I get that if you have a basic array of disks humming along with a big-ass ext4 partition, once one drive dies, the risk of the other drives being riddled with errors is huge.

But what if your array is both (1) using ZFS or BTRFS (with data checksumming) and (2) has scheduled full-disk data scrubs once a month or so? Wouldn't you catch the initial recoverable errors quick enough?

> Wouldn't you catch the initial recoverable errors quick enough?

Not always, no.

I've had drives reporting failures for months that zfs scrub keeps fixing, tons of time to get a spare.

But drives also fail suddenly with no history of zfs or SMART errors.