Well, that's not necessarily true for small inputs. Hash a single byte, for example. No two single-byte inputs collide under SHA-1, so you can build a 1:1 mapping of hashes back to inputs for that case.
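To make the single-byte case concrete, here is a minimal sketch using Python's standard `hashlib`. The names `reverse_table` and `invert_single_byte` are made up for illustration, and the lookup only works under the assumption that the input is known to be exactly one byte.

```python
import hashlib

# Map each SHA-1 digest back to the single byte that produced it.
reverse_table = {
    hashlib.sha1(bytes([b])).hexdigest(): bytes([b]) for b in range(256)
}

# 256 distinct keys means no two single-byte inputs collided.
assert len(reverse_table) == 256


def invert_single_byte(digest_hex):
    """Return the single-byte input for this digest, or None if it is not in the table."""
    return reverse_table.get(digest_hex)


print(invert_single_byte(hashlib.sha1(b"a").hexdigest()))
```

The table says nothing about longer inputs: a digest missing from it simply was not produced by a single byte.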
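But yeah, as the number of possible inputs approaches the number of possible outputs, the probability that a collision exists goes to 1 (and once there are more possible inputs than outputs, a collision is guaranteed by the pigeonhole principle). The birthday paradox formula gives the probability of a collision as a function of the number of inputs, assuming hashes land uniformly at random in the output space.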
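As a rough sketch of that formula, the usual birthday approximation is p ≈ 1 - exp(-k(k-1)/(2N)) for k inputs and N possible outputs. The function name `collision_probability` is just for illustration, and the uniform-random assumption is a model, not a proven property of SHA-1.

```python
import math


def collision_probability(num_inputs, output_space):
    """Approximate chance that at least two of num_inputs uniformly random
    hashes coincide in a space of output_space possible values."""
    exponent = num_inputs * (num_inputs - 1) / (2 * output_space)
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny.
    return -math.expm1(-exponent)


# Classic birthday case: 23 people, 365 days (approximation, close to 0.5).
print(collision_probability(23, 365))

# All 256 single-byte inputs in SHA-1's 160-bit output space: vanishingly small.
print(collision_probability(256, 2**160))

# Around 2**80 inputs the chance of a collision becomes substantial.
print(collision_probability(2**80, 2**160))
```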
> Well, that's not necessarily true for small inputs. Hash a single byte, for example. No two single-byte inputs collide under SHA-1, so you can build a 1:1 mapping of hashes back to inputs for that case.
I suppose you are saying that if you know the input is sufficiently small, you don't have to worry about collisions, which is true. I was interpreting "for small inputs" differently: that if you give a small input to a hash function (which also accepts inputs much larger than its output), you can still reconstruct that small input uniquely. Unless the hash function is deliberately designed to map small inputs 1:1, I don't think that holds, because a small input will likely share its hash with some larger input (e.g. 'a' might hash to the same value as some 14821-byte string).
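Nobody can exhibit such a collision for full SHA-1 by brute force, but the effect is easy to see with a toy hash. This sketch truncates SHA-1 to 16 bits (my assumption, purely to make the search cheap) and looks for a longer input that shares its truncated hash with 'a'. The helper names are hypothetical.

```python
import hashlib


def truncated_sha1(data, nbytes=2):
    """Toy hash: only the first nbytes of the SHA-1 digest."""
    return hashlib.sha1(data).digest()[:nbytes]


def find_longer_collision(target_input=b"a", nbytes=2, max_tries=2**20):
    """Search for a longer input with the same truncated hash as target_input.
    Returns the colliding input, or None if max_tries is exhausted."""
    target = truncated_sha1(target_input, nbytes)
    for i in range(max_tries):
        candidate = b"longer-input-" + str(i).encode()
        if truncated_sha1(candidate, nbytes) == target:
            return candidate
    return None


collision = find_longer_collision()
print(collision)
```

With 65536 possible outputs a match typically turns up after tens of thousands of tries, so given only the 16-bit hash you cannot tell whether the input was 'a' or the longer string.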