Hacker News new | ask | show | jobs
by yohanatan 4635 days ago
Actually, given a Turing Machine M which outputs x when given x, then we would have the description of program T0 (for terse language T) equal to the concatenation of the [specification of the] machine M with T0, M T0. Given program J1 (for verbose language J), its description equals M J1. Given sufficiently small M and large T0, the M factor's importance is quite reduced and we see that Kolmogorov complexity is roughly analogous to the respective sizes of T0 and J1 themselves (for this specific Turing Machine M).
1 comments

You're forgetting that Kolmogorov complexity is the shortest possible description, not just any description of the desired output. The concatenation you're suggesting is highly unlikely to be the shortest possible one.

You have to think of it in terms of compression, because that's what Kolmogorov complexity is really about. It is certainly possible that after compression J1 is shorter than T0.

You're forgetting that I get to pick the Turing machine to use. Kolmogorov complexity is the shortest possible description relative to a particular Turing machine. Re-read my previous post with that in mind.
No you don't get to pick the particular Turing machine if by "particular Turing machine" you mean a Turing machine including a particular transition function, which is the conventional definition of Turing machine (http://en.wikipedia.org/wiki/Turing_machine#Formal_definitio...)

In terms of this definition we need two Turing machines, one whose transition function can be proven to be the shortest that describes L1. And another one whose transition function can be proven to be the shortest that describes T0. These two Turing machines or transition functions are compressed representations of L1 and T0 respectively.

What I'm saying is that the one describing L1 can be shorter than the one describing T0 even if L1 is longer than T0.

I think we are approaching this from different angles. I am thinking of a single contrived M for whom a simple copy input to output is the primary supported operation and all other operations/modes, although still capable of performing arbitrary computation [hence able to execute 'compressed' programs if you should wish to attempt to provide them and also, by definition, still Turing complete], are exceedingly clumsy/verbose.

I have not only the transition function but 6 other entities to define to accomplish this (really 5 barring the 'blank symbol') so I am certain that it is not only possible but likely much smaller than one might think at first. With such a small M (that penalizes attempts to compress T0 and J1), and a large T0 [such as the code base for Microsoft Windows ported to F#] and an even larger J1 [such as the same code base written in C], we still see my desired result.

I understand what you're saying but I fail to see how your M is related to Kolmogorov complexity. You're not free to define some contrived M with particular properties that make your assumption true. Kolmogorov complexity explicitly asks for the shortest possible M for each individual string (i.e L1 and T0). Here's the definition from Wikipedia (http://en.wikipedia.org/wiki/Kolmogorov_complexity):

"More formally, the complexity of a string is the length of the shortest possible description of the string in some fixed universal description language".

M is not that universal language. M is a program written in the universal language of Turing machines. I guess that is our misunderstanding. If you define M itself as that universal language but M only ever does that one contrived thing you said (i.e. there is only one program written in that language) then it is not a universal language any longer and any comparison of lengths would become completely pointless.

In my case, the 'universal description language' is one that supports a single operation, let's say CC for 'carbon-copy', and some other number of operations (contrived to have a quite clumsy/verbose interface). The CC operation takes one input--the 'text' to output (which in my case would be T0 and J1). The 'other operations' would be only the minimal set required to still be Turing-complete (and once again would be devised such that no sane person would want to use them for any sufficiently complicated task). So, M is the implementation of the above UDL.

Then, the question becomes, how much logic do we have to put into M itself to achieve the above? I posit that it would not be very much in fact (but that should be demonstrable if someone were sufficiently motivated).

I think KC requires that M be constant across the texts you wish to compare (so that texts across various languages are normalized to an underlying architecture).

If one views M in this light (as an attempt to level the playing field across competing hardwares/algorithms/languages), then we could see it as the concatenation of the following:

1 - the specification of the raw hardware. 2 - the microcode for the hardware. 3 - BIOS, drivers, HAL, OS, etc. 4 - language implementation/runtime/shared libraries etc.

Now, to compare any two real-world implementations, we need to port them both to the language implemented in #4, for running on the hardware specified by #1-3 and sum their length with M's length.

And if we want to compare two algorithms from different languages (but running on the same 'machine'), we can shift the codes in #4 into the 'texts' of the programs themselves (so as to keep M constant between them). We can do this all the way down to the minimum M required to be Turing complete (and nearly all the way 'up' to the programs under test); i.e., the division between M and a particular text, L0, is somewhat variable.