by amluto 198 days ago
> We will evaluate these functions on 3 points:

I’m disappointed that they didn’t discuss my favorite feature of BLAKE3: it’s a tree hash. If you have a file and the BLAKE3 hash of that file, you can generate a proof that a portion of the file is correct. And you can take a file, split it into pieces (of known length and offset), hash them as you receive them, and then assemble them into the full file and efficiently calculate the full file’s hash. The other options cannot do this, although you could certainly build this on top of them.

Imagine how much better S3 would be if it used BLAKE3 instead of MD5. (Hah, S3 and its competitors don’t even manage to fully support MD5 for multipart uploads, although they could support BLAKE3 very well with multipart uploads!)


> If you have a file and the BLAKE3 hash of that file, you can generate a proof that a portion of the file is correct

This seems wrong to me? I would expect you could only verify the entire file.

This is a difference between BLAKE3 and most other hash functions. In the usual arrangement ("Merkle–Damgård"), each block depends on the previous one, so the only way to verify some "slice" of the input is to re-hash the whole thing. But when you arrange the input into a tree shape (a "Merkle tree") instead, suddenly the right half of the tree does not depend on the left half until the very last step at the very top. If you give me the input to that last step, I can verify that it matches the root hash that I know. Now I have the hashes ("chaining values") I'd need to verify either the left half or the right half without the other. Then I do our favorite trick in computer science, which is to recursively apply that same procedure all the way down, until I have an efficient "path" to whatever part of the tree I actually care about.
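To make the "path" idea concrete, here is a rough sketch of a toy binary Merkle tree in Python. It is not BLAKE3's actual tree mode: SHA-256 from hashlib stands in for the compression function, and the names `leaf`, `parent`, `build_levels`, `prove`, and `verify` are made up for illustration. The verifier only needs the root hash, the chunk it cares about, and one sibling hash per level.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash (assumption): real BLAKE3 uses its own compression
    # function, chunk counters, and domain-separation flags.
    return hashlib.sha256(data).digest()

def leaf(chunk: bytes) -> bytes:
    return h(b"\x00" + chunk)          # prefix separates leaves from parents

def parent(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)

def build_levels(chunks):
    """Return all tree levels, from leaf hashes up to the single root."""
    levels = [[leaf(c) for c in chunks]]
    while len(levels[-1]) > 1:
        prev, nxt = levels[-1], []
        for i in range(0, len(prev), 2):
            if i + 1 < len(prev):
                nxt.append(parent(prev[i], prev[i + 1]))
            else:
                nxt.append(prev[i])    # odd node is promoted unchanged
        levels.append(nxt)
    return levels

def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):       # no sibling means the node was promoted
            path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(root, chunk, path):
    """Recompute the root from one chunk plus its sibling hashes."""
    node = leaf(chunk)
    for sibling, sibling_is_left in path:
        node = parent(sibling, node) if sibling_is_left else parent(node, sibling)
    return node == root

chunks = [b"chunk0", b"chunk1", b"chunk2", b"chunk3"]
levels = build_levels(chunks)
root = levels[-1][0]
proof = prove(levels, 2)
print(verify(root, b"chunk2", proof))    # the genuine chunk checks out
print(verify(root, b"tampered", proof))  # a modified chunk does not
```

The proof for one chunk is a handful of hashes (logarithmic in the number of chunks), which is why the rest of the file never has to be re-hashed. This toy version does not bind the chunk's position or the total length into the hashes, so treat it as an illustration only.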

For more on this see Section 6.4 of our paper: https://docs.google.com/viewer?url=https://github.com/BLAKE3...

And the Bao repo: https://github.com/oconnor663/bao

Yeah, I was objecting to this part:

> If you have a file and the BLAKE3 hash of that file

To me that means the final hash. If you have the full tree of hashes that is a different story!

PS. Thanks for making BLAKE3! I use it in several pieces of software.

Isn't the hash of multipart uploads just the hash of all the hashes of each part? I have actually replicated the multipart hash locally.
Something vaguely along those lines, and not the same thing on Google vs AWS.

But this isn’t the desired behavior! If you upload the same logical bytes as a single part or as multiple parts, you should get the same thing, and BLAKE3 could do this.
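A small sketch of the difference, under some loud assumptions: SHA-256 stands in for both MD5 and BLAKE3, the "hash of part hashes" scheme is a generic version of what the parent comment describes (not the exact S3 or Google algorithm), and the chunk size and function names are hypothetical. Parts are assumed to be aligned to chunk boundaries, except possibly the last one.

```python
import hashlib

CHUNK = 4  # hypothetical tiny chunk size so the example stays readable

def h(tag: bytes, data: bytes) -> bytes:
    # Stand-in hash (assumption), with a tag byte to separate leaves and parents.
    return hashlib.sha256(tag + data).digest()

def leaf_hashes(part: bytes) -> list[bytes]:
    """Hash one part chunk by chunk; needs nothing from the other parts."""
    return [h(b"\x00", part[i:i + CHUNK]) for i in range(0, len(part), CHUNK)]

def root(leaves: list[bytes]) -> bytes:
    """Combine leaf hashes pairwise until one root remains (non-empty input)."""
    level = list(leaves)
    while len(level) > 1:
        nxt = [h(b"\x01", level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])      # odd node is promoted unchanged
        level = nxt
    return level[0]

def hash_of_part_hashes(parts: list[bytes]) -> bytes:
    """Generic multipart scheme: hash the concatenation of per-part hashes."""
    digests = b"".join(hashlib.sha256(p).digest() for p in parts)
    return hashlib.sha256(digests).digest()

def tree_hash_from_parts(parts: list[bytes]) -> bytes:
    """Tree scheme: parts are hashed independently, then only hashes are merged."""
    leaves = []
    for p in parts:                    # each part could be hashed on its own machine
        leaves.extend(leaf_hashes(p))
    return root(leaves)

data = b"abcdefghijklmnopqrstuvwx"     # 24 bytes, i.e. 6 chunks
one_part = [data]
three_parts = [data[:8], data[8:16], data[16:]]

# The hash-of-hashes result depends on how the upload was split...
print(hash_of_part_hashes(one_part) == hash_of_part_hashes(three_parts))
# ...while the tree hash depends only on the logical bytes.
print(tree_hash_from_parts(one_part) == tree_hash_from_parts(three_parts))
```

The point is that with a tree hash the part boundaries disappear from the final result, as long as the parts line up with the tree's chunk boundaries.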