|
|
|
|
|
by viraptor
902 days ago
|
|
> And it doesn't matter if all of Wikipedia is embedded in the compressor That's not what I said. I said you can embed a better result. As in, make an algorithm that performs ok in general case, but bake in a super-optimised result that specifically applies to Wikipedia, even if it takes months to precompute. > Why is a time limit needed? Would you like to judge my compressor against other ones? It will take 2000 years to get the result, but it's totally worth it, I promise. It's a random search of optimal binaries that produce the Wikipedia and it will stop once I'm over the winning threshold. |
|
I still tend to consider the needed complexity to produce that binary as irrelevant, because the challenge is not interested in practical compression, only in space complexity and the relation between information entropy and intelligence, no?
If such a highly-specialized binary could be theoritically produced for other corpuses of the same size as well, that would be very interesting! Regardless of how practical their computation is.
And if it's not possible for othe corpuses, the complexity has to be somehow embedded in the decompressor, not the compressor, or am I missing something? Only alternative would be that the compression exploits some data inherent to the machine architecture that is not part of the executable. Still, this only shifts the problem around.
The idea reminds of an optimized ordering of Gödel numbers.
The pidgeonhole principle limits what a general lossless compressor can do, but we don't know how the minimum size of a compressed file scales with the output size, because this problem is related to the Halting problem. Even if we assume infinite compute, we can't know the exact limits for calculating a function |c(n)| that specificies the minimum length of a program that computes an arbitrary output of size |n|. If I'm not totally ignorant and lacking information here.