by londons_explore
592 days ago
> offset the weight of all the unique entries in your dedup table

Didn't read the 7000 words... But isn't the dedup table in the form of a bunch of Bloom filters, so the whole dedup table can be stored with ~1 bit per block? When the filter says a block is likely a duplicate, you can add it to a table of likely duplicates, and find all the actual duplicates in a single scan later. That avoids the massive accounting overhead of storing per-block metadata.
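To make the idea concrete, here is a rough Python sketch of the two-pass scheme I have in mind. It is only an illustration under my own assumptions, not how ZFS or any real filesystem implements dedup: `BloomFilter` and `find_duplicates` are hypothetical names, blocks are plain in-memory `bytes`, and the filter size and hash count are arbitrary (sized far more generously than 1 bit per block to keep false positives rare).

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Tiny Bloom filter; num_bits and num_hashes are illustrative assumptions."""

    def __init__(self, num_bits: int, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting BLAKE2b with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                key, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key: bytes) -> bool:
        """Set the key's bits; return True if all were already set (maybe seen)."""
        seen = True
        for pos in self._positions(key):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen


def find_duplicates(blocks, num_bits=1 << 16):
    """Return sorted groups of block indexes that hold identical content."""
    bloom = BloomFilter(num_bits)
    candidates = set()

    # Pass 1: only the Bloom filter is kept for every block. A digest is
    # remembered in full only when the filter reports a possible repeat.
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if bloom.add(digest):
            candidates.add(digest)

    # Pass 2: one later scan groups the blocks whose digest is a candidate.
    groups = defaultdict(list)
    for index, block in enumerate(blocks):
        digest = hashlib.sha256(block).digest()
        if digest in candidates:
            groups[digest].append(index)

    # Bloom false positives end up as single-member groups and are dropped.
    return sorted(ids for ids in groups.values() if len(ids) > 1)


if __name__ == "__main__":
    print(find_duplicates([b"a", b"b", b"a", b"c", b"b", b"a"]))
```

A Bloom filter never gives false negatives, so the second copy of any block is always flagged in pass 1, and pass 2 then picks up the first copy too. The cost is the extra scan, plus whatever false positives a tightly sized filter lets through.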