by the_duke 3623 days ago
I wonder if they use a rolling checksum too, to avoid duplicating a complete file if only a few bytes shifted (for example, adding a line of text at the beginning of a file).

The backup tool bup (https://github.com/bup/bup) does this.
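
For anyone unfamiliar with the idea, here is a minimal Python sketch of content-defined chunking with an rsync-style rolling checksum. This is not bup's actual code: the window size, the boundary mask, and the names `WINDOW`, `MASK` and `split` are all made up for illustration. It only shows why a few bytes inserted at the front should disturb just the first chunk or so.

```python
import random

WINDOW = 16          # bytes covered by the rolling checksum (assumed value)
MASK = (1 << 6) - 1  # cut when the low 6 bits are all ones, roughly 1 in 64 positions


def split(data: bytes) -> list[bytes]:
    """Split data into chunks whose boundaries depend only on nearby content."""
    a = b = 0   # a: plain sum of the window; b: weighted sum, older bytes weigh more
    start = 0
    out = []
    for i, new in enumerate(data):
        # Byte leaving the window (treated as 0 until the window is full).
        old = data[i - WINDOW] if i >= WINDOW else 0
        # Constant-time update: no need to re-read the whole window.
        a = (a - old + new) & 0xFFFF
        b = (b - WINDOW * old + a) & 0xFFFF
        if (b & MASK) == MASK:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing bytes after the last boundary
    return out


# Demo: prepend a line of text and count how many chunks are unchanged.
# A real dedup store would key chunks by a cryptographic hash, not raw bytes.
original = bytes(random.Random(0).randrange(256) for _ in range(4096))
shifted = b"one new line of text\n" + original
known = set(split(original))
new_chunks = split(shifted)
reused = sum(1 for c in new_chunks if c in known)
print(f"{reused} of {len(new_chunks)} chunks in the shifted file were already known")
```

Because the checksum depends only on the last `WINDOW` bytes, the boundaries line up again shortly after the inserted text. A store keyed by chunk hash would then only need to save the first chunk or two again, where fixed-size blocks would all shift and none would match.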

1 comment

They almost certainly do not, mostly because doing so is slow.
It probably wouldn't hit the most important cases either. Dedup is typically most powerful & valuable on large media files, software packages, disk ISOs, and the like, which do not frequently have arbitrary text inserted at the start of the file!