|
|
|
|
|
by bloaf
2169 days ago
|
|
I agree. I've seen some marvelous pigeon hole explanations of why you cannot write a general lossless compression algorithm that makes all files smaller. I have no idea what that explanation would look like in terms of maxes and averages, and I have to believe it would be much more difficult to understand. |
|
Compressing (n+1)-bit input to n-bit output makes the average output correspond to 2^(n+1)/2^n = 2 inputs. By Dijkstra's rephrasing of the pigeonhole principle, the maximum number of inputs colliding on the same output must be at least 2.
Or we could make it more robust and say we're compressing to n-bit-or-smaller output. Then the average output corresponds to 2^(n+1)/(2^(n+1)-1) inputs. That average is still strictly greater than 1, so the maximum number of inputs colliding on the same output must also be strictly greater than 1 (at least 2, since that's the smallest natural which is greater than 1).