| HN Mirror

> if an attacker can tamper with the data they can probably just create a new hash at the same time

That's true for ordinary databases, but this was developed for a blockchain and uses a Merkle hash tree.

An attacker can only tamper with the data and create a new hash for a data item by also creating a new hash for every node up to the root of the tree. In a blockchain context, even that isn't enough, they'd have to modify the blockchain nodes as well, as I presume they periodically record tree root hashes.

The hash tree gives it some other interesting features too. O(n) diff time, where n is the number of changes output in the diff, is probably due to having a hash tree.

The fast diff would also work with a non-cryptographic hash, but it would be considered not quite reliable enough against occasional, random errors. With a cryptographic hash, for non-security purposes we treat the values as reliably unique for each input. For example, see Git which depends on this property.