Hacker News new | ask | show | jobs
by wuiheerfoj 902 days ago
Running 2^64 SHA1 ops on a GPU takes 15 years, so I think finding a reasonable collision for that half using SHA2/3 is not as trivial as you suggest:

https://crypto.stackexchange.com/questions/84520/how-long-wo...

2 comments

A chosen-prefix (i.e. a demonstration how to modify a given legal document to obtain two different documents that have the same hash) SHA-1 collision has already been computed:

"We have successfully run the computation during two months last summer, using 900 GPUs (Nvidia GTX 1060)."

Such resources can be easily rented from a cloud.

So for anyone willing to spend up to 100k USD, it is trivial to find SHA-1 collisions.

https://sha-mbles.github.io/

Since that post was published, the 4090 came out, which can (according to this hashcat benchmark [1]) do 50,638.7 million SHA1 hashes per second, so now it would only take a single 4090 GPU 11.55 years. Or you could buy 12 of them and do it in a year, etc. So it's definitely not cheap but 15 years is definitely an overestimate (and presumably GPUs will keep getting faster...).

SHA2-256 is "only" 21975.5 MH/s so you'd have to double the number of GPUs or amount of time.

[1] https://gist.github.com/Chick3nman/32e662a5bb63bc4f51b847bb4...

Generating 2^64 hashes isn't guaranteed to produce a collision, and even if a collision did exist in that set, you're not going to find it by getting a bunch of GPUs to compute 2^64 hashes. There's a huge difference between a haystack that maybe contains a needle, and a needle that's been pulled from the haystack and presented to you. To actually find and identify the collisions you'll need to hook those GPUs up to some sort of storage/retrieval system. Just to store 2^64 128-bit hashes would take 295.1 exabytes. That's an order of magnitude more storage than NSA's utah datacenter[1].

[1] https://en.wikipedia.org/wiki/Utah_Data_Center

You can generate pairs of hashes for random inputs and check for collision without storing all of the outputs, no?
True, which is why 256-bit or bigger hashes are recommended to make sure that the time for finding a collision is alone great enough to make this impossible.

Finding a collision for an 160-bit hash by brute force, on a big cloud or supercomputer, is about the maximum that someone with unlimited financial resources could do.

That requires way more hashes to be computed though.
I don't think that's correct? It's probabilistic, yes, but in expectation you would still need ~2^64 hashes to find a collision for a 128-bit hash (birthday paradox).
See my above comment. There's a huge difference between having a collision somewhere in a pile of 2^64 hashes, and actually having the colliding hashes ready to present.