Hacker News new | ask | show | jobs
by folli 2230 days ago
You need to watch his video, very cool, it really helps to understand how this works: https://www.youtube.com/watch?v=f9EbD6iY9zI
2 comments

There's a few things in there that are factually incorrect -- in particular, the false notion that "every input has a unique output" can be quite dangerous in some cryptographic settings.

That said, the purpose of this talk is about the mechanics of the function, and not its properties or how to use it safely. So don't let that detract from what is, really, an awesome presentation.

I'm sorry, could you please elaborate? I was always under the assumption that hash functions have to be deterministic, and thus, that "every input has a unique output" was a correct statement.

AFAIK the contrary is invalid, so that "not every output is the result of one and only one input".

A function being deterministic means that any input will have a single output. But it is not unique for any hash function, SHA-256 included. The definition of a hash function is any function which takes an arbitrary length input and outputs an n-bit output for some fixed value of n. By virtue of the fact that you have infinite inputs and finite outputs, the outputs cannot be unique.

Generally when people make this claim, what they're actually referring to is what's called Collision Resistance (CR) and/or Weak Collision Resistance (WCR), which instead make claims on difficulty of finding such collisions (of which infinitely many exist).

WCR, necessary for almost any cryptographic use, states that for any given input it should be difficult to find a different input which hashes to the same value. CR, generally desirable for cryptographic hash functions, states that it should be difficult to find two different inputs such that their hashes are equal. CR implies WCR, but WCR does not imply CR -- for example, SHA-256 (currently) exhibits CR but SHA-1 only exhibits WCR.

There are 2^256 potential outputs for SHA-256, while the number of potential inputs is infinite. Therefore, the same output can be generated with different inputs, although finding such "collisions" by chance is extremely unlikely
The claim is not that every output has a unique input, which would not be correct, and seems to be what you are addressing.
at 1:08 in the video, that is exactly what he claims:

"So every piece of data in the world has its own unique hash digest."

This is false for the reasons apeescape describes: every piece of data in the world has its own hash digest, but these hash digests are not unique.

Yes that sentence is technically incorrect, but practically correct. We've never found a collision and though we expect it to be theoretically possible, even common if you consider "all possible inputs" and the pigeonhole principle, for practical purposes hash outputs are unique because nobody considers "all possible inputs" when evaluating probabilities.

I'm saying that for a layman explanation, it's reasonable to say that hash outputs are unique. Because following that with "technically, it's more 'practically' unique, theoretically there are collisions but you won't encounter them with probability > 2^-256" (or whatever it is) just confuses the topic to them more than just summarizing. You have to admit that most people won't go on a 200h adventure to learn about the state space of 256+ bits and how to conceptualize tiny statistical probabilities, so there must be a point where you have to cut the explanation to an approximation of the truth. This is true in every field.

On the other hand, if we can count "every piece of data in the world" then we can estimate the probability of having a collision.
I see what you mean, but it sounds like the output is unique, and we probably agree that in this field you need to use sentences that cannot be easily misinterpreted.
That video is really, really awesome! And it won't leave you feeling "Japanese" either. (Which is a great people, btw. I'd really like to go there someday, mostly for the food and language and history. And Anime also, I'm forced to admit.)