by redundantly 1176 days ago
The DDT (deduplication table) used to require ~1GB of RAM for every 1TB of data written over the life of the file system. Deleting data from the file system wouldn't remove the dedupe references; you'd have to recreate the pool entirely and start over with a new DDT.
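
To put rough numbers on that rule of thumb, here is a minimal Python sketch. The function name and the default of 1 GB per TB are illustrative assumptions taken from the figure above, not anything ZFS itself exposes, and real DDT memory use depends on record size and how much of the data is unique.

```python
def estimate_ddt_ram_gb(cumulative_written_tb: float, gb_per_tb: float = 1.0) -> float:
    """Rough DDT RAM estimate from the ~1 GB per 1 TB rule of thumb.

    cumulative_written_tb is everything ever written to the pool, not the
    data currently stored: per the comment above, deletes did not shrink
    the table.
    """
    if cumulative_written_tb < 0 or gb_per_tb < 0:
        raise ValueError("inputs must be non-negative")
    return cumulative_written_tb * gb_per_tb


# Hypothetical pool: 6 TB written, 2 TB of it later deleted, 4 TB written again.
# The deleted 2 TB still counts toward the estimate.
writes_tb = [6.0, 4.0]
print(f"{estimate_ddt_ram_gb(sum(writes_tb)):.1f} GB")
```

The point of summing writes rather than looking at used space is that the estimate only ever grows until the pool is recreated.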

However, there are now special devices that can be used to store the DDT. Typically this is done with two SSDs configured as a mirrored vdev for the DDT metadata. This reduces the memory overhead, but it does cost some performance and still has the same limitation: the DDT size can only be reduced by re-creating the pool.


Yes, this is what I do. I wouldn't enable dedupe for all datasets, though. In this case, for instance, he wanted dedupe on his games, so I would enable it just for that dataset.

But I do run disk-wide compression with no problems and have done so on all my datasets for many years now, and it's been a tremendous space saver. Especially on machines where I have a lot of VMs/containers, it's not unusual for me to see a compression ratio of 2 with good old lz4. It will be interesting to see what damage zstd will do once I start experimenting with that.
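
For anyone wondering what a ratio of 2 means in practice, here is a small Python sketch of the arithmetic. The function names and the byte counts are made up for illustration; the ratio is simply uncompressed size divided by the size actually stored on disk.

```python
def compression_ratio(logical_bytes: int, physical_bytes: int) -> float:
    """Uncompressed (logical) size divided by on-disk (physical) size."""
    if logical_bytes < 0 or physical_bytes <= 0:
        raise ValueError("logical_bytes must be >= 0 and physical_bytes > 0")
    return logical_bytes / physical_bytes


def space_saved_fraction(ratio: float) -> float:
    """Fraction of the logical size that compression saves, e.g. 0.5 for 2x."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 - 1.0 / ratio


# Hypothetical dataset: 400 GiB of VM images stored in 200 GiB on disk.
GIB = 1024 ** 3
ratio = compression_ratio(400 * GIB, 200 * GIB)
print(f"ratio {ratio:.2f}, saved {space_saved_fraction(ratio):.0%}")
```

So a ratio of 2 halves the space those VMs/containers take up.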

+1 for compression! Unless you have very specific performance needs, there's no reason not to use compression. In fact, TrueNAS (previously known as FreeNAS) has it enabled by default.

I've yet to compare lz4 to zstd myself, but I've read great things about zstd.