by Elhana 3694 days ago
As I understand it, in the btrfs case there are two problems: 1) Metadata in btrfs can use lots of space, especially when you convert from ext4. You can end up with gigabytes of free space reserved for metadata, so it can no longer be used for data. This can be solved with rebalancing, but that can take ages, which is actually one of the reasons zfs doesn't have a block pointer rewrite (bp rewrite) feature. 2) btrfs can have mixed raid levels, and in that scenario calculating free space is tricky, but people still rely on common tools that simply give estimates in that case. Change the way it estimates free space and you'll have fewer clueless people complaining about the fs running out of it, but more will say btrfs shows too little.
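To illustrate why a single "free space" number is hard to report with mixed raid levels, here is a rough sketch. The table and function names are hypothetical, and the model is a deliberate simplification: it only accounts for how many raw bytes each profile consumes per byte of data, and ignores metadata reservation, uneven device sizes, and parity profiles.

```python
# Hypothetical model: raw bytes consumed per byte of user data.
# raid1, raid10 and dup keep two copies; single and raid0 keep one.
RAW_PER_DATA_BYTE = {
    "single": 1.0,
    "raid0": 1.0,
    "raid1": 2.0,
    "raid10": 2.0,
    "dup": 2.0,
}


def estimate_usable(raw_free_bytes, profile):
    """Usable bytes if ALL remaining raw space were allocated with `profile`."""
    return int(raw_free_bytes / RAW_PER_DATA_BYTE[profile])


def estimate_range(raw_free_bytes, profiles_in_use):
    """Pessimistic and optimistic estimates when several profiles coexist.

    A tool that prints one number has to pick a point in this range.
    """
    estimates = [estimate_usable(raw_free_bytes, p) for p in profiles_in_use]
    return min(estimates), max(estimates)


GIB = 2**30
low, high = estimate_range(100 * GIB, ["single", "raid1"])
print(low // GIB, high // GIB)
```

Reporting the low end makes the filesystem look like it shows too little; reporting the high end leads to surprise when it runs out of space early.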

FWIW, Hammer on DragonFly BSD can rebalance and dedup with little memory, and it doesn't take long, but details matter and the comparison may not be fair. What counts as a rebalance in ZFS might be something much more trivial and less effective in Hammer, but I've deduped Hammer filesystems on machines with little memory compared to what ZFS requires for its in-memory data structures.
ZFS' data deduplication requires very little memory. However, it checks every new record write against every other record already written. The only way to do this in a performant way is to lean on cache. Without sufficient cache, you degrade to performing 3 random IOs one after another for each write, which performs terribly. The system will continue to run, but it will be slow.
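A minimal sketch of the general idea of online deduplication, to show why every write involves a table lookup. This is not ZFS' actual implementation; the class name, the use of SHA-256, and the in-memory dict are all assumptions made for illustration. In a real filesystem the table lives on disk and only stays fast while it fits in cache.

```python
import hashlib


class DedupStore:
    """Toy online dedup: one table lookup per record write."""

    def __init__(self):
        self.table = {}    # digest -> [block index, reference count]
        self.blocks = []   # stands in for on-disk blocks
        self.lookups = 0   # each one would be disk IO if the table is not cached

    def write(self, record: bytes) -> int:
        digest = hashlib.sha256(record).digest()
        self.lookups += 1
        entry = self.table.get(digest)
        if entry is not None:
            # Duplicate: bump the reference count, store nothing new.
            entry[1] += 1
            return entry[0]
        # New record: store it and remember its digest.
        self.blocks.append(record)
        index = len(self.blocks) - 1
        self.table[digest] = [index, 1]
        return index


store = DedupStore()
ids = [store.write(r) for r in (b"a", b"b", b"a")]
print(ids, len(store.blocks), store.lookups)
```

The table grows with the number of unique records, which is why the lookup cost depends on how much of it stays cached.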

As far as I know, there is no way to implement online deduplication with constant RAM usage without performing poorly as things scale or playing Schrödinger's cat with whether data that should deduplicate is actually deduplicated. Offline data deduplication might work, but it would cripple performance given ZFS' data integrity guarantees.

If HAMMER has online data deduplication that is performant with constant RAM, they likely made a sacrifice elsewhere to get it. My guess is that it misses cases, such that while you would expect each unique record to be written once, it can be written multiple times.
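A sketch of what such a trade-off could look like. This is purely hypothetical and not a description of how HAMMER works: it caps the dedup table at a fixed number of entries and evicts the least recently used one, so memory stays constant but a record whose entry was evicted gets stored again.

```python
import hashlib
from collections import OrderedDict


class BoundedDedupStore:
    """Toy dedup with a fixed-size table: constant RAM, but misses duplicates."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.table = OrderedDict()  # digest -> block index, oldest entry first
        self.blocks = []            # stands in for on-disk blocks

    def write(self, record: bytes) -> int:
        digest = hashlib.sha256(record).digest()
        if digest in self.table:
            # Hit: refresh recency and reuse the existing block.
            self.table.move_to_end(digest)
            return self.table[digest]
        # Miss: either a new record or one whose entry was evicted.
        self.blocks.append(record)
        index = len(self.blocks) - 1
        self.table[digest] = index
        if len(self.table) > self.max_entries:
            self.table.popitem(last=False)  # evict the least recently used entry
        return index


small = BoundedDedupStore(max_entries=2)
for r in (b"A", b"B", b"C", b"A"):
    small.write(r)
print(len(small.blocks))  # b"A" was evicted before its second write
```

With `max_entries=3` the same sequence stores only three blocks, so whether a duplicate is caught depends on table size and access pattern rather than on the data alone.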