by Rochus 2117 days ago

What is the use case? Why is it important that "All keys, values are backed by blake 256 bit checksum"?

jopari 2117 days ago

It seems to be intended as the backend database for the Dero blockchain smart contract platform: https://medium.com/deroproject/graviton-zfs-for-key-value-st...

The post claims: "The features included in Graviton provide the missing functionality that prevented Stargate RC1 from reaching deployment on our mainnet."

I'm not sure, but I guess that this checksumming is relevant for storing the Merkle trees encoding the blockchain. I don't know why the previous choice of database wasn't suitable.

naivedevops 2117 days ago

ZFS stores checksums of its data to detect bit rot. Since they are comparing their database to ZFS, I guess it stores the checksums for the same reason. If bit rot occurs, you don't need to discard the entire database, just the affected entry. If the entry has been there for some time, you might even be able to restore it from a backup.
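
A rough sketch of that idea, not Graviton's actual API: `put`, `get`, `checksum` and the dict-backed store are made up for illustration, and `hashlib.blake2s` (32-byte digest by default) is only assumed as a stand-in for the "blake 256 bit checksum". With one checksum per entry, a reader can reject just the damaged entry:

```python
import hashlib

def checksum(key: bytes, value: bytes) -> bytes:
    # Length-prefix the key so different (key, value) splits hash differently.
    return hashlib.blake2s(len(key).to_bytes(4, "big") + key + value).digest()

def put(store: dict, key: bytes, value: bytes) -> None:
    # Keep the digest next to the value it protects.
    store[key] = (value, checksum(key, value))

def get(store: dict, key: bytes) -> bytes:
    value, digest = store[key]
    if checksum(key, value) != digest:
        raise ValueError(f"checksum mismatch for key {key!r}")
    return value

store = {}
put(store, b"alice", b"100")
put(store, b"bob", b"250")

# Simulate bit rot: flip one bit in bob's stored value, leave the digest alone.
value, digest = store[b"bob"]
store[b"bob"] = (bytes([value[0] ^ 0x01]) + value[1:], digest)

print(get(store, b"alice"))  # the undamaged entry is still readable
try:
    get(store, b"bob")
except ValueError as err:
    print("detected:", err)
```

Only the entry whose bytes changed fails verification; the rest of the store stays usable.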

Rochus 2117 days ago

I can understand it with a file system; but in a typical key/value store application the data elements are much smaller (likely even smaller than the hash result).

GordonS 2117 days ago

Isn't a 256-bit Blake hash a little OTT, versus a simple CRC, or even a faster, smaller hash like MurmurHash or Jenkins-one-at-a-time?

jlokier 2117 days ago

It's a cryptographic hash, so it will detect tampering with the data, which a simple CRC, MurmurHash or Jenkins would not.
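
One way to see the difference, as a small sketch rather than an attack: CRC-32 is an affine function over XOR, so for equal-length messages the checksum of a modified message can be predicted from other checksums. The `xor` helper and the sample messages are made up for this example, and `hashlib.blake2s` is only assumed as a stand-in for the 256-bit BLAKE hash:

```python
import hashlib
import zlib

def xor(*chunks: bytes) -> bytes:
    # XOR equal-length byte strings together.
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

def blake(data: bytes) -> bytes:
    return hashlib.blake2s(data).digest()

# Three arbitrary messages of the same length (12 bytes each).
a, b, c = b"balance=0100", b"balance=9999", b"padding-XYZ!"

# CRC-32 of the XOR equals the XOR of the CRC-32 values.
crc_is_affine = zlib.crc32(xor(a, b, c)) == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)

# A cryptographic hash has no such exploitable structure.
blake_is_affine = blake(xor(a, b, c)) == xor(blake(a), blake(b), blake(c))

print("CRC-32 relation holds:", crc_is_affine)
print("BLAKE2s relation holds:", blake_is_affine)
```

That algebraic structure is what lets someone adjust data so that a CRC still matches, whereas finding a different input with the same cryptographic digest is believed to be infeasible.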

GordonS 2117 days ago

Still, I'd like an option to use a faster, more efficient CRC or hash - bit rot is usually the main threat, rather than tampering. Not to mention that if a user can tamper with the data they can probably just create a new hash at the same time.

Using a cryptographic hash as a souped-up CRC seems rather odd, given how many more CPU cycles and RAM it will use, but I don't know the reasoning behind the decision; there must be one.

jlokier 2117 days ago

> if an attacker can tamper with the data they can probably just create a new hash at the same time

That's true for ordinary databases, but this was developed for a blockchain and uses a Merkle hash tree.

An attacker can only tamper with the data and create a new hash for a data item by also creating a new hash for every node up to the root of the tree. In a blockchain context, even that isn't enough: they'd have to modify the blockchain nodes as well, as I presume they periodically record tree root hashes.
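
A minimal sketch of why that is, assuming a plain binary Merkle tree (Graviton's real tree layout may well differ; `h`, `merkle_root` and the sample entries are hypothetical, and `hashlib.blake2s` is an assumed stand-in for the hash):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.blake2s(data).digest()

def merkle_root(leaves: list) -> bytes:
    # Hash each leaf, then hash pairs upward until one root remains.
    # Assumes a non-empty list; an odd node is paired with itself.
    level = [h(b"leaf:" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

entries = [b"alice=100", b"bob=250", b"carol=75", b"dave=10"]
recorded_root = merkle_root(entries)  # e.g. a root recorded elsewhere

# Tamper with one entry: every hash on its path to the root changes,
# so the recomputed root no longer matches the recorded one.
tampered = list(entries)
tampered[1] = b"bob=999999"
print(merkle_root(tampered) == recorded_root)
```

Rewriting the leaf hash alone is not enough; the mismatch only disappears if the recorded root is replaced as well.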

The hash tree gives it some other interesting features too. O(n) diff time, where n is the number of changes output in the diff, is probably due to having a hash tree.

The fast diff would also work with a non-cryptographic hash, but it would be considered not quite reliable enough against occasional, random errors. With a cryptographic hash, for non-security purposes we treat the values as reliably unique for each input. Git, for example, depends on this property.
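
To make the fast diff concrete, here is a sketch under simplifying assumptions: both trees hold the same keys in the same shape, and `build`, `diff` and the dict-based node layout are invented for illustration, not taken from Graviton. Subtrees whose hashes match are skipped without looking inside them:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.blake2s(data).digest()

def build(items: list) -> dict:
    # Build a hash tree over a non-empty, sorted list of (key, value) pairs.
    if len(items) == 1:
        key, value = items[0]
        leaf = h(b"leaf:" + len(key).to_bytes(4, "big") + key + value)
        return {"hash": leaf, "item": items[0]}
    mid = len(items) // 2
    left, right = build(items[:mid]), build(items[mid:])
    return {"hash": h(b"node:" + left["hash"] + right["hash"]),
            "left": left, "right": right}

def diff(old: dict, new: dict, visited: list):
    # Yield (key, old_value, new_value) for every changed leaf.
    visited.append(old["hash"])
    if old["hash"] == new["hash"]:
        return  # identical subtree: skip it entirely
    if "item" in old:
        yield old["item"][0], old["item"][1], new["item"][1]
        return
    yield from diff(old["left"], new["left"], visited)
    yield from diff(old["right"], new["right"], visited)

items = [(b"key%d" % i, b"value%d" % i) for i in range(8)]
changed = list(items)
changed[5] = (b"key5", b"something else")

visited = []
changes = list(diff(build(items), build(changed), visited))
print(changes)
print(len(visited), "nodes visited out of 15")
```

In this toy binary tree the walk touches only the nodes along the path to the changed leaf and their siblings, so the cost follows the number of changes rather than the size of the store. Treating equal hashes as equal subtrees is exactly where the "reliably unique" property matters.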

GordonS 2117 days ago

I meant to say "attacker", rather than "user".
