by jlokier 2117 days ago
It's a cryptographic hash, so it will detect tampering with the data, which a simple CRC, MurmurHash or Jenkins hash would not.
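
To make the CRC point concrete, here is a minimal sketch using Python's standard `zlib` and `hashlib`. It relies on CRC-32 being affine over XOR for equal-length messages, so the checksum of a modified message can be derived from the old checksum and the change alone. The messages and the `xor_bytes` helper are made up for illustration.

```python
import hashlib
import zlib

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (illustrative helper)."""
    return bytes(p ^ q for p, q in zip(x, y))

# Hypothetical messages of equal length.
original = b"pay alice 0100 coins"
tampered = b"pay alice 9900 coins"

# The attacker only needs the old CRC and the bit-level change they want.
delta = xor_bytes(original, tampered)
zeros = bytes(len(original))

# For equal lengths: crc(a ^ d) == crc(a) ^ crc(d) ^ crc(00...0).
forged_crc = zlib.crc32(original) ^ zlib.crc32(delta) ^ zlib.crc32(zeros)
crc_forged_ok = forged_crc == zlib.crc32(tampered)

# SHA-256 has no known shortcut of this kind; the digests simply differ.
sha_differs = (
    hashlib.sha256(original).digest() != hashlib.sha256(tampered).digest()
)

print(crc_forged_ok, sha_differs)
```

This only shows that a CRC has exploitable structure. It says nothing about where the checksum is stored or who is allowed to rewrite it.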

Still, I'd like an option to use a faster, more efficient CRC or hash - bit rot is usually the main threat, rather than tampering. Not to mention that if a user can tamper with the data they can probably just create a new hash at the same time.

Using a cryptographic hash as a souped-up CRC seems rather odd, given how many more CPU cycles and RAM it will use, but I don't know the reasoning behind the decision; there must be one.

> if an attacker can tamper with the data they can probably just create a new hash at the same time

That's true for ordinary databases, but this was developed for a blockchain and uses a Merkle hash tree.

To tamper with a data item undetected, an attacker has to create a new hash not just for that item but for every node on the path up to the root of the tree. In a blockchain context even that isn't enough: they'd have to modify the blockchain nodes as well, as I presume those periodically record tree root hashes.
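
A rough sketch of why that is, assuming a plain binary Merkle tree over SHA-256. The project's actual tree layout isn't described here, so `merkle_levels`, the prefix bytes and the sample items are all illustrative rather than its real format.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(items):
    """Return every level of the tree, from the leaf hashes up to [root].

    Assumes a non-empty list. Leaves and interior nodes get different
    prefix bytes so one can never be mistaken for the other.
    """
    level = [h(b"\x00" + item) for item in items]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd count: pair the last node with itself
        level = [
            h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels

items = [b"a", b"b", b"c", b"d"]
tampered_items = list(items)
tampered_items[2] = b"X"  # tamper with a single data item

original = merkle_levels(items)
tampered = merkle_levels(tampered_items)

# For each level, list the positions whose hash changed.
changed = [
    [i for i, (x, y) in enumerate(zip(lo, lt)) if x != y]
    for lo, lt in zip(original, tampered)
]
print(changed)  # [[2], [1], [0]]: the leaf, its parent, and the root
```

Changing one item changes exactly the hashes on its path to the root, so anyone holding a previously recorded root hash notices unless that record is rewritten too.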

The hash tree gives it some other interesting features too. The O(n) diff time, where n is the number of changes output in the diff rather than the size of the data, is probably a consequence of having a hash tree.
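
Here is a sketch of how a hash tree makes that kind of diff possible, assuming two trees of identical shape built over the same number of items. `Node`, `build` and `diff` are hypothetical names, not the project's API. In this simple binary version the work is roughly the number of changes times the tree depth, not the total data size.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    digest: bytes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    index: Optional[int] = None  # set on leaves only

def build(items, lo=0, hi=None):
    """Build a binary hash tree over items[lo:hi] (assumes at least one item)."""
    if hi is None:
        hi = len(items)
    if hi - lo == 1:
        return Node(hashlib.sha256(b"\x00" + items[lo]).digest(), index=lo)
    mid = (lo + hi) // 2
    left, right = build(items, lo, mid), build(items, mid, hi)
    digest = hashlib.sha256(b"\x01" + left.digest + right.digest).digest()
    return Node(digest, left, right)

def diff(a, b, visited):
    """Return indexes of differing leaves, skipping subtrees whose hashes match."""
    visited[0] += 1
    if a.digest == b.digest:
        return []  # identical subtree: no need to look inside
    if a.index is not None:
        return [a.index]
    return diff(a.left, b.left, visited) + diff(a.right, b.right, visited)

old = [b"record-%d" % i for i in range(1024)]
new = list(old)
new[5] = b"changed"
new[700] = b"also changed"

visited = [0]
changes = diff(build(old), build(new), visited)
print(changes)     # [5, 700]
print(visited[0])  # a few dozen nodes, out of 2047 in each tree
```

Skipping a subtree on a hash match is only safe if equal hashes reliably mean equal contents, which is where the choice of hash function matters.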

The fast diff would also work with a non-cryptographic hash, but it would not be considered quite reliable enough, since occasional random collisions could hide a real difference. With a cryptographic hash, even for non-security purposes, we treat the values as reliably unique for each input. Git, for example, depends on this property.
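
As an illustration of the Git case: in a repository using the default SHA-1 object format, a blob's ID is the SHA-1 of a short header followed by the file contents, and Git treats two blobs with the same ID as the same content. `git_blob_id` is an illustrative helper, not part of any Git library.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob (SHA-1 object format)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The well-known ID of the empty blob.
empty_id = git_blob_id(b"")
print(empty_id)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Same content always gives the same ID; different content is assumed to
# give a different one, which is what content-addressed storage relies on.
same = git_blob_id(b"hello\n") == git_blob_id(b"hello\n")
different = git_blob_id(b"hello\n") != git_blob_id(b"hello!\n")
print(same, different)
```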

I meant to say "attacker", rather than "user".