Hacker News
by pdkl95 2558 days ago

    The following stack will be used as reference, with users
    connecting via web, desktop and smartphone clients:

        Client: Riot-web v1.2.1,
                Riot Desktop v1.2.1,
                Riot Android v0.9.1

        Server: Synapse v1.0.0
Version numbers are probably sufficient in a general scientific setting. They are usually a precise reference to a specific piece of software: anyone attempting to replicate the investigation should be able to find their own copy of the software and have reasonable confidence that their copy is identical.

Unfortunately, it might not be a good idea to trust that a version number consistently maps to a specific URL, or that a server will give the same file to everyone each time they ask for a URL. We know that sending different versions to different people is common ("A/B testing"). If you're investigating the security of something, or worse, you suspect you might have sentient opponents actively trying to deceive you, then version numbers are no longer sufficient: you should also include cryptographic checksums! The only way you can know that the file you received is the same is if you have e.g. SHA-2 hashes as proof. Even better, if it's important, include the RIPEMD-160, SHA-1, CRC32, and any other available hash/checksum, because why not add redundancy and give people options?
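As a rough illustration of the "include several hashes" idea, here is a minimal Python sketch that computes SHA-256, SHA-1 and CRC32 for one file in a single pass. The function name `file_digests` and the chunk size are my own choices, not anything from the document under discussion. RIPEMD-160 is left out on purpose, because whether `hashlib` exposes it depends on how the local OpenSSL was built.

```python
import hashlib
import zlib


def file_digests(path, chunk_size=65536):
    """Return several digests of one file, reading it only once.

    Hypothetical helper for illustration; not part of any tool
    mentioned in the thread.
    """
    sha256 = hashlib.sha256()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as fh:
        # Read in chunks so large tarballs do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)
    return {
        "sha256": sha256.hexdigest(),
        "sha1": sha1.hexdigest(),
        # CRC32 is an error-detection checksum, formatted as 8 hex digits.
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
    }
```

Calling `file_digests("synapse-1.0.0.tar.gz")` would return a dict with one entry per algorithm, which can then be pasted into a report next to the version number.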

1 comments

Totally fair point, thank you for bringing it up. Given the numerous build types (source, pip, debian packages, etc), what would you suggest doing in this case? Give the git commit hash maybe?
A traditional approach is to attach a checksum file with all of the relevant packages, in the usual ${hash}sum output format (hashes truncated for HN-page readability):

    $ sha256sum *
    e406bcc...51c199a  riot-android-0.9.1.tar.gz
    8020cc6...d6126c1  riot-v1.2.1.tar.gz
    443b612...51e0cef  synapse-1.0.0.tar.gz
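For anyone who wants to produce or check such a file without the coreutils tools, the following is a minimal Python sketch. It assumes the plain `sha256sum` line layout (64 hex digits, a space, a mode character, then the file name) and ignores the escaped-filename variant. The manifest name `SHA256SUMS` and the function names are assumptions made for this example only.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory, manifest_name="SHA256SUMS"):
    """Write one 'digest  filename' line per regular file in directory."""
    directory = Path(directory)
    lines = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.name != manifest_name:
            lines.append(f"{sha256_of(path)}  {path.name}\n")
    (directory / manifest_name).write_text("".join(lines))


def verify_manifest(directory, manifest_name="SHA256SUMS"):
    """Return {filename: True/False} for every entry in the manifest."""
    directory = Path(directory)
    results = {}
    for line in (directory / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        # Layout: 64 hex digits, one space, one mode character, file name.
        expected, name = line[:64], line[66:]
        target = directory / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

A reader who downloads the same packages can run `verify_manifest(".")` next to the published manifest; any entry mapped to `False` means their copy differs from the one that was investigated (or is missing).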
> Given the numerous build types (source, pip, debian packages, etc)

In the interest of making a reproducible investigation, it might be a good idea to include hashes for the specific packages being investigated.

> Give the git commit hash maybe?

That would probably work? This gets into the problem of reproducible builds, where builds from different environments might not be identical. So documenting that you used "a build of version 1.2.1 git commit 7446799e4b0e3e65122f5642b5f3a8c59aae15bf" says something slightly different from saying you used "riot-v1.2.1.tar.gz with SHA256 8020cc617367a4318be090b1562a26571f1a3417b0d4a52b2d4f19e03d6126c1". That said, obviously having literally any hash to work from is much better than using version numbers alone.
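To make the reproducible-builds caveat concrete, here is a small Python sketch (my own toy example, not taken from any of the projects above). It packages the same bytes twice with gzip, changing only the timestamp stored in the gzip header, which stands in for "two builds from different environments". The payload bytes and the `mtime` values are arbitrary.

```python
import gzip
import hashlib

# Stand-in for the source tree at one fixed commit (arbitrary bytes).
source = b"identical source contents\n"

# Two "builds" of the same source; only the header timestamp differs.
build_a = gzip.compress(source, mtime=1)
build_b = gzip.compress(source, mtime=2)

# The unpacked contents are identical...
same_payload = gzip.decompress(build_a) == gzip.decompress(build_b)

# ...but the artifacts themselves hash differently.
hash_a = hashlib.sha256(build_a).hexdigest()
hash_b = hashlib.sha256(build_b).hexdigest()
same_artifact = hash_a == hash_b

print("same payload: ", same_payload)
print("same artifact:", same_artifact)
```

Here `same_payload` is true while `same_artifact` is false, which is the gap between "I used commit X" and "I used the file with SHA256 Y": the commit pins the source, the file hash pins the exact bytes that were examined.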

GitHub links that include the commit hash might be useful, but it seems like you cannot link to both a tag and a hash? I wonder if GitHub supports links that are a combination of https://github.com/vector-im/riot-web/releases/tag/v1.2.1 and https://github.com/vector-im/riot-web/commit/7446799e4b0e3e6... ?

After talking with the other contributors to the doc, we decided not to go into further detail.

We acknowledge the need for reproducible investigation, but the document did not explain in a scientific manner how we reached such outcomes. We had to draw a line to keep the document on point with our message. Adding hashes wouldn't really make a significant difference.

We'll make sure to keep this in mind if we do write a follow-up with details on reproducible checks though. Thank you for your insight!