by ErikCorry 1234 days ago
This seems like a weak argument.

Firstly SHA is not a secure hash.

Secondly if your build step involves uploading data to a third party then allowing them to transform it as they see fit and then checksumming the result then it's not really a reproducible build. For all you know, Github inserts a virus during the compression of the archive.

What am I missing?

4 comments

> Firstly SHA is not a secure hash.

It's... literally the Secure Hash Algorithm. (Yes, yes, SHA-1 was broken a while back, but SHA and its derivatives were absolutely intended to provide secure collision resistance.)

I think you're mixing things up here. Github didn't change the SHA-1 commit IDs in the repositories[1]. They changed the compression algorithm used for (and thus the file contents of) "git archive" output. So your tarballs have the same unpacked data but different hashes under all algorithms, secure or not.
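To make that concrete, here's a rough Python sketch of the effect. It is not what GitHub actually runs: the payload is made up, and two gzip compression levels stand in for "the compressor changed".

```python
import gzip
import hashlib

# Stand-in for an uncompressed tar stream; the content is made up.
payload = b"".join(b"file %d: some source text\n" % i for i in range(2000))

# Same input, two compression settings. mtime=0 keeps timestamps out of it,
# so any difference comes from the compressor alone.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


print("unpacked data equal: ", gzip.decompress(fast) == gzip.decompress(best))
print("archive hashes equal:", sha256_hex(fast) == sha256_hex(best))
```

Any checksum pinned against the compressed bytes breaks, even though nothing about the source tree changed.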

> Secondly if your build step involves uploading data to a third party then allowing them to transform it as they see fit and then checksumming the result then it's not really a reproducible build. For all you know, Github inserts a virus during the compression of the archive.

Indeed. That's why you take and record a SHA-256 of the archive file you are tagging: then no one can feasibly swap in different contents without the mismatch showing up!
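For anyone who hasn't done this by hand, a minimal sketch of the check in Python. The function names are mine, and where the pinned digest lives (lockfile, build recipe, whatever) is up to your tooling.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, pinned_hex: str) -> bool:
    """True only if the file's SHA-256 matches the digest recorded earlier."""
    return hmac.compare_digest(sha256_of_file(path), pinned_hex.lower())
```

The build would refuse to continue when `verify_archive` returns False, which is exactly the failure people hit when the generated archives changed underneath them.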

Again, what's happened here is that the links pointing to generated archive files that projects assumed were immutable turned out not to be. It's got nothing to do with security or cryptography.

[1] Which would be a whole-internet-breaking catastrophe, of course. They didn't do that and never will.

>Firstly SHA is not a secure hash.

This is incorrect, but even if it were true, you could use whatever your hash of choice is instead. Gentoo for example can use whatever hash you like, such as blake2, and the default Gentoo repo captures both the sha512 and blake2 digests in the manifest.

Sha1 is still used for security purposes anyways, even though it really shouldn't be!

Signing git commits still relies on sha1 for security purposes, which I think many people don't realize.

Commit signing only signs the commit object itself; other objects such as trees, blobs and tags are not directly covered by the signature. The commit object contains SHA-1 hashes of its parents and of a root tree. Since trees contain hashes of all of their items, this creates a recursive chain of hashes over the entire contents of the repo at that point in time!
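If it helps, here is a small Python sketch of that chain for a SHA-1 repository. It is a simplification: one file, no subdirectories, tree entries sorted by plain name, and made-up author/committer lines. The helper names are mine, not git's.

```python
import hashlib


def git_object_id(kind: bytes, body: bytes) -> bytes:
    # Objects are hashed as "<kind> <length in bytes>" + NUL + body.
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def blob_id(content: bytes) -> bytes:
    return git_object_id(b"blob", content)


def tree_id(entries: list) -> bytes:
    # Each entry is (mode, name, raw 20-byte object id).
    # Sorting by plain name is a simplification of git's real ordering.
    body = b"".join(
        mode + b" " + name + b"\0" + oid
        for mode, name, oid in sorted(entries, key=lambda e: e[1])
    )
    return git_object_id(b"tree", body)


def commit_id(tree: bytes, parents: list, meta: bytes, message: bytes) -> bytes:
    # The commit body names the root tree and parents by hex id, nothing more.
    lines = [b"tree " + tree.hex().encode()]
    lines += [b"parent " + p.hex().encode() for p in parents]
    body = b"\n".join(lines) + b"\n" + meta + b"\n" + message
    return git_object_id(b"commit", body)


# Hypothetical author/committer lines, for illustration only.
META = (
    b"author A U Thor <author@example.com> 0 +0000\n"
    b"committer A U Thor <author@example.com> 0 +0000\n"
)


def snapshot(readme: bytes) -> bytes:
    """Commit id for a repo holding a single README with the given content."""
    tree = tree_id([(b"100644", b"README", blob_id(readme))])
    return commit_id(tree, [], META, b"initial\n")


# A one-byte change in the file propagates up to a different commit id.
print(snapshot(b"hello\n").hex())
print(snapshot(b"hello!\n").hex())
```

The signature is computed over the commit body only, so it vouches for the file contents just as far as the hash linking blob to tree to commit can be trusted.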

So signed commits rely entirely on the security of sha1 for now!

You may have already known all of this about git signing, but I thought it might be interesting to mention.

1) SHA-256 is reasonably secure

2) The checksum assures you that the file you have is the same one your upstream looked at

1) Ah of course, this is SHA256, my mistake.

2) If the upstream and I are both looking at a file that was generated by Github then the SHA may match, but that doesn't prove we weren't both owned by Github.

Perhaps what I am missing is that this isn't part of a reproducible build scenario. There's no attempt to ensure that the file Github had built is the one I would build with the same starting point.

If you trust your upstream, then the checksum is enough. If you don't trust your upstream, it's sort of an RCE anyways.
I think the reproducible build part is about projects that depend on these outputs. The goal is ensuring you and I have both pulled exactly the same dependencies.