Hacker News new | ask | show | jobs
by yohanatan 4634 days ago
In my case, the 'universal description language' is one that supports a single operation, let's say CC for 'carbon-copy', and some other number of operations (contrived to have a quite clumsy/verbose interface). The CC operation takes one input--the 'text' to output (which in my case would be T0 and J1). The 'other operations' would be only the minimal set required to still be Turing-complete (and once again would be devised such that no sane person would want to use them for any sufficiently complicated task). So, M is the implementation of the above UDL.

Then, the question becomes, how much logic do we have to put into M itself to achieve the above? I posit that it would not be very much in fact (but that should be demonstrable if someone were sufficiently motivated).

I think KC requires that M be constant across the texts you wish to compare (so that texts across various languages are normalized to an underlying architecture).

If one views M in this light (as an attempt to level the playing field across competing hardwares/algorithms/languages), then we could see it as the concatenation of the following:

1 - the specification of the raw hardware. 2 - the microcode for the hardware. 3 - BIOS, drivers, HAL, OS, etc. 4 - language implementation/runtime/shared libraries etc.

Now, to compare any two real-world implementations, we need to port them both to the language implemented in #4, for running on the hardware specified by #1-3 and sum their length with M's length.

And if we want to compare two algorithms from different languages (but running on the same 'machine'), we can shift the codes in #4 into the 'texts' of the programs themselves (so as to keep M constant between them). We can do this all the way down to the minimum M required to be Turing complete (and nearly all the way 'up' to the programs under test); i.e., the division between M and a particular text, L0, is somewhat variable.

1 comments

So you are constructing a particular universal language that is designed to make any compression algorithm written in it so long that the concatenation of the algorithm and its own output would always be longer than its input.

First of all, I doubt that this is even possible, because the input, i.e L1 and T0 is arbitrarily long in the general case, while the algorithm's length must be determined before it sees any input, i.e it is constant. That's related to the invariance theorem: http://en.wikipedia.org/wiki/Kolmogorov_complexity#Invarianc...

But even if it were possible or if we agree to arbitrarily limit the size of the input for practical purposes, you wouldn't have achieved your original goal, which was to show that a verbose language can never be as expressive as a terse language.

Your universal language is neither a Turing machine nor any other generally accepted universal language, so no one would agree to accept this particular language as a test of Kolmogorov complexity.

If you can disprove the invariance theorem you may have come up with a valid criticism of Kolmogorov complexity itself, but you were the one who initially based his argument about the expressiveness of programming languages on Kolmogorov complexity. If you invalidate Kolmogorov complexity you invalidate your own argument as well.

In any event, this has been a fun thought experiment.

Yes, fun thought experiment indeed.

I think I am suggesting to construct a particular universal language designed to make any compression of some reasonable T0 and J1 written in that UL (e.g., the source code of Microsoft Windows written in F# and C respectively) longer than T0 and J1 themselves, hence leaving them to be their own Kolmogorov limit (if you will). This seems entirely doable to me within the confines of the 5-tuples afforded me by the formal definition of a Turing machine (although I can see how others, including you, might be less convinced). :-) Maybe I, or someone else sufficiently motivated, will attempt it someday.