by pabs3 25 days ago
Build reproducibility checks usually use bitwise comparison, not hash comparison.

The Reproducible Builds project also wrote diffoscope, which goes quite far with helping identify where differences occur and how to fix them.

https://reproducible-builds.org/ https://diffoscope.org/ https://try.diffoscope.org/


> Let’s say, for the positive case, hash comparison is significantly faster.
I feel like that is quite unlikely. Both the hash and bitwise comparisons have to read both files. In the not-equal case the hash reads the entirety of both files, so it's slower than a start-to-end bitwise comparison, which exits at the first differing byte. In the equal case, both read the entirety of both files. Various other bitwise strategies can be faster than start-to-end: rdfind, for example, checks the start of the file first, then the end, then the rest of the file.
I think we’re not talking about the same scenario. I’m talking about the case where at least one hash has already been calculated.
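
A minimal sketch of that scenario, assuming a SHA-256 hex digest was recorded for one artifact earlier (the function name and the choice of SHA-256 are assumptions for illustration): only the second file has to be read, whereas a byte-for-byte comparison needs both files available and readable.

```python
import hashlib


def matches_known_digest(path, expected_hex, chunk=64 * 1024):
    """Hash one file and compare it against a previously stored digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest() == expected_hex.lower()
```

This still reads the whole file even when it differs, so the saving comes from skipping the other file entirely, not from exiting early.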