by kafrofrite
1511 days ago
There's some theoretical background on how hash functions work. Generally, the assumption is that cryptographic hash functions meet the following:
- Each input can be quickly converted to a digest (hash).
- Getting from a digest to the input should be infeasible.
- It should be infeasible to produce two inputs that map to the same digest (strictly speaking, what people tend to prove is that doing so is inefficient, not that it is impossible).
and a few more assumptions that are irrelevant to your question. In a nutshell, the idea is that hashing functions are deterministic. One input maps to one output, and by knowing the output I cannot find the input, nor can I produce two inputs that map to the same output. Now, those attacks basically violate the principle that one input maps to one output. Suddenly, many inputs map to the same output. What's the risk though?
You want to download a Linux distro whose published hash digest is ABC. I'm somewhat interested in people using that Linux distro, so I tamper with the distro to introduce a tracking mechanism, in such a way that the tampered version also produces ABC when hashed. You download the distro, verify that the hash is indeed ABC, and install it. A more structured answer to your question is available here: https://github.com/corkami/collisions (he also put together a presentation that I can't find right now).
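To make the download check concrete, here is a minimal sketch of what "verify that the hash is indeed ABC" amounts to. It assumes SHA-256 via Python's hashlib; the byte strings standing in for the distro image and the helper names are made up for illustration. With a hash that still resists collisions, the tampered image fails the check. The attack described above only works if I can craft a tampered image with the same digest.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Hypothetical helper: compare the digest of data with the published one."""
    return sha256_hex(data) == published_hex.lower()


# Stand-ins for a real image; purely illustrative bytes.
original_image = b"pretend this is the distro image"
tampered_image = original_image + b" plus a tracking mechanism"

# The digest the distro maintainers would publish ("ABC" in the story above).
published = sha256_hex(original_image)

# Deterministic: hashing the same bytes again gives the same digest.
print(matches_published_digest(original_image, published))
# A naive modification changes the digest, so the check catches it.
print(matches_published_digest(tampered_image, published))
```

The check is only as strong as the hash: if collisions for it can be produced on demand, a matching digest no longer tells you the bytes are the ones that were published.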
Hope this helps :)
|
> Now, those attacks basically violate the principle that one input maps to one output
That’s not really… the principle. That’s an impossible desire. You simply cannot fit a larger set in a smaller set. Any function that transforms an arbitrary set of data into a smaller set of data is definitionally going to have collisions.
A flawless hashing function that transforms an arbitrary amount of data into a fixed amount of data is subject to an infinite number of collisions for every given output.
The problem is not that there are collisions; it’s mathematically impossible for there not to be. The problem is the speed at which they can be found. If you can find them quickly enough, there is the potential to find malicious inputs in the mix.
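A rough sketch of why speed is the whole issue: below, SHA-256 is truncated to a few bytes (the truncation and the function names are assumptions for the demo, not a real attack on SHA-256). With so few possible outputs, the pigeonhole argument guarantees a collision, and a plain brute-force search finds one almost immediately. At the full 32-byte output the same search is far out of reach, which is what "infeasible" means in practice.

```python
import hashlib
from itertools import count


def truncated_sha256(data: bytes, n_bytes: int) -> bytes:
    """SHA-256 cut down to its first n_bytes, to make the output space tiny."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int):
    """Brute-force two distinct inputs with the same truncated digest.

    Tries the inputs b"0", b"1", b"2", ... and remembers every digest seen.
    Must stop within 256**n_bytes + 1 attempts (pigeonhole).
    """
    seen = {}
    for i in count():
        message = str(i).encode()
        digest = truncated_sha256(message, n_bytes)
        if digest in seen:
            return seen[digest], message, i + 1
        seen[digest] = message


first, second, attempts = find_collision(3)
print(first, second, attempts)
print(truncated_sha256(first, 3) == truncated_sha256(second, 3))
```

Each extra byte of output multiplies the number of possible digests by 256, so the expected work for this kind of search grows quickly with digest length. A hash is considered broken when someone finds a shortcut that beats this generic search by a wide margin.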