|
|
|
|
|
by andrewaylett
4121 days ago
|
|
It's not that there might be a collision, but that collisions are guaranteed. How many 3Mb files are there? 2^(3M8) = 2^25165824. How many distinct hashes are there? 2^(2561001) = 2^256256. By the pigeonhole principle, we can't fit 2^25165824 objects into 2^256256 holes; indeed each file will have on average 2^24909568 other files that share the same set of 1001 sha256 hashes. The reason that we are able to work with hashes, and that compression works in practice, is that we don't have to deal with every 3M file. Most of these files are gibberish and will never be seen, and finding two files that match even one hash is incredibly difficult. But once we start talking about brute-forcing, we start encountering problems -- and having to dedicate an awful lot of processing power to the problem isn't the biggest one... |
|