by vardump
3778 days ago
The files on CDs are aligned to 2 kB boundaries. Dedup looks for identical contiguous blocks of n kB. If the block size of the material you want to dedup does not match the block size of the dedup system, you'll get suboptimal results. The bigger the difference, the worse the results.

Say you have this data: ABCABCBACCBBABCC

A dedup system with a block size of 1 can see that you really have just three unique blocks: A, B and C.

Same data, but dedup with a block size of 2: AB CA BC BA CC BB AB CC

Dedup with a block size of 2 thinks you have 6 unique blocks: AB, CA, BC, BA, CC and BB.

Etc.
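
The same counting can be reproduced with a few lines of Python. This is only a toy sketch of fixed-size, aligned block dedup, not how any particular dedup system works; the function name `unique_blocks` is made up for illustration.

```python
def unique_blocks(data: str, block_size: int) -> set:
    """Cut data into fixed-size blocks starting at offset 0 and return the distinct ones."""
    return {data[i:i + block_size] for i in range(0, len(data), block_size)}


data = "ABCABCBACCBBABCC"

for size in (1, 2):
    blocks = unique_blocks(data, size)
    total = -(-len(data) // size)  # ceiling division: number of blocks stored without dedup
    print(f"block size {size}: {len(blocks)} unique out of {total} blocks")
```

With a block size of 1, the 16 blocks collapse to 3. With a block size of 2, the 8 blocks only collapse to 6, so the larger block size saves far less on the same data.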
|
Perhaps there is a high degree of fragmentation, then, and the files are not stored in contiguous blocks?
(This example would be the same if, instead of 2 exactly duplicated files, we had a big common chunk shared between 2 files.)