Hacker News new | ask | show | jobs
by click170 3384 days ago
> The recent attack uses special techniques to exploit weaknesses in the SHA-1 algorithm that find a collision in much less time. These techniques leave a pattern in the bytes which can be detected when computing the SHA-1 of either half of a colliding pair.

> GitHub.com now performs this detection for each SHA-1 it computes, and aborts the operation if there is evidence that the object is half of a colliding pair.

Isn't it possible for a valid non-colliding object or commit to contain that pattern as well? It sounds like eventually, though possibly in the far distant future, someone will be unable to push a commit to Github because it matches the pattern but doesn't contain colliding objects.

Does anyone know what the pattern is they're looking for? I'm curious now.

3 comments

It is possible; the researchers estimate the likelihood of a false positive at 2^-90 (which puts us back in "Sun engulfs the Earth" territory).

There are metrics that will alert GitHub's infrastructure team if a collision is found (to confirm that we aren't seeing any false positives). Those metrics were quietly shipped (without the matching "die") for a week before flipping the final switch.

If you want to know more about the patterns, see the sha1collisiondetection project:

https://github.com/cr-marcstevens/sha1collisiondetection

There's a research paper linked in the README.

They say that the chance of a false positive is less than 2^-90.
Would it be possible to exploit the found bit-pattern in a "false positive" (that wasn't intentionally constructed to collide) to construct a second object with that preimage? Is a preimage attack in that situation more feasible?
Per commit, or over the expected lifetime of Git's utilization of sha-1? If it's the former it's not very useful unless we have an idea how many commits are made.
You should expect to see a hash collision with probability >½ with about sqrt(2^n) hashed objects for a hash of length n. I believe hashes are made per changed file per commit.

A large project (like Firefox) might make a few hundred commits per day, or tens of thousand per year. So that comes out to 2^20-2^30 hashes per year. I don't know how many distinct repositories GitHub has, but I doubt it's larger than 2^30. So that means that GitHub has no more than 2^60 SHA-1 hashes related to git, and probably more like 2^40-2^45.

So the probability of a collision is <<2^-30 (the collision function is logistic, so assuming linearity between 1 and 2^90 is a wildly overoptimistic assumption) in the optimistic case, probably something like <<2^50. In perspective, it's more likely that you will win the lottery tomorrow but fail to discover so because you were killed by lightning than it is that GitHub will detect a chance SHA-1 collision.

Wait, are we talking about chance SHA-1 collisions, or a false positive for this test for this vulnerability?
It's not logistic, that's a specific kind of function.
There are ~3.1e9 seconds in a century, and ~7.5e9 people on Earth. So if every human makes a git commit every second for a century, the odds of a single false positive is ~5.3e-7 (i.e. less than one in a million).
if every human makes a git commit every second for a century

Will someone please write a dystopian novel around this premise? It almost writes itself. GitHub turns evil, forcing the world population into subservience from birth; using Git is all anyone is ever allowed to do, and is how people conceptualize the universe and access entertainment; various cults form with the belief that randomness can be influenced via the right ceremony...

You can source inspiration from https://twitter.com/DystopianYA

I was recently thinking of another idea where humanity works for Google and everyone's job is to solve recaptchas. Everything is free for people who earn enough credits by solving the puzzles.
Isn't this an episode of black mirror?
Sounds like the plot to For-Profit Online University[1]. "I'm a digital gardener."

[1] https://www.youtube.com/watch?v=XQLdhVpLBVE

If it is not a movie yet.. you should definitely write one :D
> various cults form

...each believing in a Supreme branching strategy

And your status in society is determined by the number of consecutive digits of the hash of your very first git commit.
Alternative premise, tech interviews continue to become increasingly competitive, and if you want to be a true dev hirable by Google, etc. this is the output you need.
I could even see a matrix like aspect where those who are unplugged do so by creating hash collisions to hide what they are really doing.
You're right, I suck at probabilities and did it wrong: (3.1e9*7.5e9)/2^90 .
Well, I'd lay a fair amount of money that the number of commits is less than 10^20...
It's also possible for a SHA-1 collision to result from random chance, however it's extremely extremely unlikely to the point where it will almost certainly never happen.
This is true of any hash though. SHA-1 is broken because it is known how to create a collision intentionally.
True of any good hash, at least.

So, to the question "Isn't it possible for a valid non-colliding object or commit to contain that pattern as well", the answer is "No, not really."