by radiowave
5406 days ago
Yup, that's about the same proportion I found when I recently tried copying my data across to a ZFS system with dedup switched on. I then decided to disable dedup, because it comes at a cost: the checksum data (which would mostly be living on the SSD read cache I had attached) was occupying SSD space worth about 3 times as much money as the conventional disk space that the duplicate data was occupying. I noticed that the opendedup site (linked from the article) claims a much lower volume of checksum data relative to the number of files, perhaps an order of magnitude less than I observed with ZFS, but they seem to achieve that by using a fixed 128KB block size, which brings along its own waste. (ZFS uses a variable block size.) I haven't actually run the numbers here, but I wouldn't be at all surprised to find that, for my data, the 128KB block size would cost as much disk space as dedup was saving me. (YMMV, of course.)
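For anyone who wants to run that comparison on their own data, here is a rough Python sketch. Everything in it is an assumption for illustration: the function names, the per-entry table size, and the prices are hypothetical inputs, and the slack calculation assumes each file's final block is padded out to the full fixed block size, which is a reading of the comment above rather than a confirmed property of opendedup.

```python
def dedup_cost_ratio(data_bytes, avg_block_bytes, entry_bytes,
                     dup_fraction, ssd_price_per_gb, hdd_price_per_gb):
    """Money spent holding the dedup table on SSD, divided by the money
    saved on conventional disk. All inputs are caller-supplied guesses."""
    total_blocks = data_bytes / avg_block_bytes
    # Assumption: one table entry per unique block.
    unique_blocks = total_blocks * (1 - dup_fraction)
    table_bytes = unique_blocks * entry_bytes
    saved_bytes = data_bytes * dup_fraction
    table_cost = table_bytes / 1e9 * ssd_price_per_gb
    saved_cost = saved_bytes / 1e9 * hdd_price_per_gb
    return table_cost / saved_cost


def slack_bytes(file_sizes, block_bytes):
    """Bytes wasted if every file is padded up to a multiple of a fixed
    block size (assumption: no tail packing)."""
    return sum((-size) % block_bytes for size in file_sizes)


if __name__ == "__main__":
    # Hypothetical numbers; substitute your own measurements.
    ratio = dedup_cost_ratio(
        data_bytes=1e12, avg_block_bytes=100_000, entry_bytes=100,
        dup_fraction=0.5, ssd_price_per_gb=10, hdd_price_per_gb=1)
    print("table cost / savings:", ratio)
    print("slack:", slack_bytes([1, 131072, 131073], 128 * 1024), "bytes")
```

A ratio above 1 means the table costs more than the duplicates it removes. Feeding `slack_bytes` a real list of file sizes and comparing the result with the space dedup saves gives the other half of the comparison.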
I'm puzzled why people in general aren't more worried about data corruption due to hash collisions.
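One way to put a number on that worry is the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/2^(b+1)), for n blocks and a b-bit hash. The sketch below is only that approximation; it assumes the hash behaves like a uniformly random function and says nothing about deliberately constructed collisions. The function name and the example block count are hypothetical.

```python
import math


def collision_probability(num_blocks, hash_bits):
    """Birthday-bound estimate of the chance that at least two of
    num_blocks distinct blocks share a hash_bits-wide hash.
    Assumes an ideal, uniformly distributed hash."""
    exponent = -num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical pool: 2**35 blocks (4 PiB at 128KB each), 256-bit hash.
    print(collision_probability(2 ** 35, 256))
    # For contrast: 2**16 blocks with a 32-bit checksum.
    print(collision_probability(2 ** 16, 32))
```

With a 256-bit hash the estimate for 2^35 blocks is on the order of 2^-187, whereas a 32-bit checksum over just 2^16 blocks already collides more than a third of the time, so the width of the hash is what matters here.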