by linuxready 3778 days ago
I'm sorry, I'm not sure I get it. Let's say you have a 1000 kB file that is duplicated and stored on contiguous blocks (so if the CDs use 2 kB sectors, that is 500 contiguous blocks). If ZFS uses a 128 kB block size, it will find 7 full blocks (896 kB) that it can deduplicate. So we only lose about 10%.

Perhaps there is a high degree of fragmentation, then, and files are not on contiguous blocks?

(This example would be the same if, instead of two exactly duplicated files, we had a big common chunk between two files.)
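
The arithmetic above can be checked with a small sketch. It assumes both copies start at the same offset within a 128 kB block, and it only counts how many whole 128 kB blocks fit inside the 1000 kB run; the function name `full_records` is made up for illustration and is not part of ZFS.

```python
FILE_KB = 1000    # size of the duplicated run, from the example above
RECORD_KB = 128   # ZFS block size used in the example
SECTOR_KB = 2     # CD sector size used in the example

def full_records(offset_kb: int) -> int:
    """Whole 128 kB blocks that lie entirely inside the run,
    when the run starts offset_kb into a 128 kB block."""
    first_boundary = -(-offset_kb // RECORD_KB)          # ceiling division
    last_boundary = (offset_kb + FILE_KB) // RECORD_KB   # floor division
    return last_boundary - first_boundary

for offset in (0, 2, 24):
    n = full_records(offset)
    lost = FILE_KB - n * RECORD_KB
    print(f"offset {offset:3d} kB: {n} blocks, {lost} kB ({lost / FILE_KB:.1%}) not covered")
```

With a start offset of 0 this gives the 7 blocks (896 kB) quoted above; for some offsets only 6 whole blocks fit, so "about 10%" is the best case even when both copies are aligned identically.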

1 comments

Wrong. If the alignment is wrong, you'll likely lose 100%. A 2 kB sector can sit at any of 64 different positions within a 128 kB block.

Both copies of that 1000 kB run of contiguous 2 kB sectors must start at exactly the same offset modulo 128 kB for any block to match. There are 64 possible alignments.
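
A rough simulation of the point, not of ZFS itself: it assumes deduplication simply compares hashes of fixed 128 kB blocks, uses random bytes as stand-ins for the file and its surroundings, and every name in it (`make_image`, `shared_records`, and so on) is invented for the sketch.

```python
import hashlib
import os

SECTOR = 2 * 1024      # 2 kB CD sector
RECORD = 128 * 1024    # 128 kB dedup block

def record_hashes(image: bytes) -> set:
    """Hash every fixed-size 128 kB block of the image."""
    return {hashlib.sha256(image[i:i + RECORD]).digest()
            for i in range(0, len(image), RECORD)}

def make_image(payload: bytes, start_sector: int, total: int) -> bytes:
    """Place payload at the given 2 kB sector, surrounded by random filler."""
    offset = start_sector * SECTOR
    before = os.urandom(offset)
    after = os.urandom(total - offset - len(payload))
    return before + payload + after

def shared_records(sector_a: int, sector_b: int, payload: bytes,
                   total: int = 2 * 1024 * 1024) -> int:
    """Number of 128 kB blocks with identical content in both images."""
    a = record_hashes(make_image(payload, sector_a, total))
    b = record_hashes(make_image(payload, sector_b, total))
    return len(a & b)

payload = os.urandom(1000 * 1024)        # the duplicated 1000 kB file

print(shared_records(0, 0, payload))     # same offset: 7 blocks match
print(shared_records(0, 64, payload))    # shifted by exactly 128 kB: 7 blocks match
print(shared_records(0, 1, payload))     # shifted by one 2 kB sector: 0 blocks match
```

Only shifts that are a multiple of 64 sectors (128 kB) keep the block boundaries in the same place relative to the file; any other shift changes the content of every block, so nothing deduplicates.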

Oh that's it then ! Thanks for the clarification.