Hacker News new | ask | show | jobs
by arketyp 810 days ago
Richard von Mises (brother of the economist) formulated a definition of randomness as a sequence of data that, were you a gambler, you cannot by any strategy make money on betting on the outcomes. This was before computational calculus and was later developed by Kolmogorov and others in algorithmic complexity. The modern variation would be (Wiki) "considering a finite sequence random (with respect to a class of computing systems) if any program that can generate the sequence is at least as long as the sequence itself".
4 comments

> you cannot by any strategy make money on betting on the outcomes

What does "strategy" mean here? I might just happen to have a strategy which involves betting on the exact sequence of heads and tails in a given sequence. The analogy in terms of languages is that my language might just happen to have a short keyword that represents a given sequence of heads and tails.

I don't know much about Kolmogorow complexity so I'm certainly missing something here. Potentially there is a subtle clause in the technical definition that doesn't make it through to these articles.

> What does "strategy" mean here? I might just happen to have a strategy which involves betting on the exact sequence of heads and tails in a given sequence.

That's a very narrow program.

> The analogy in terms of languages is that my language might just happen to have a short keyword that represents a given sequence of heads and tails.

The sequence still needs to be generated "somehow". Either by executing the program and producing the sequence, or by explicitly stating it. Even if you have it "cached" and "represented" in your language, you still need to generate the sequence. The resources spent here is the Kolmogorov complexity.

The easiest way to expand your little program is to say that you have a seed s.t. any consecutive generation results in a consecutive sequence that matches up to the period of the generator. Now it is more generic, but has a period. You can then expand this to accept multiple seeds and once it has reached a period, to simply take the next seed.

Should this sequence be finite, you are in luck. Your program can have length O(generator + N/P) where N is length of sequence, and P is the period of your RNG.

All this is is just compression which plays into the whole Kolmogorov complexity.

The idea is that you bet before the sequence is known. Nowadays we would say it is the distribution (or the process producing the random sequences) that can be truly random or not, and we recognize that saying "sequence [...] is random" is incoherent, same as the joke of the random int set to 4 with a comment in the source code that it was chosen by fair dice roll.

If you know everything about the process and still can't beat chance at predicting it, that's the quality we are after. In this definition "random" just means unpredictable, which is another way to explain why it can only be a meaningful distinction when you don't yet know the result.

> What does "strategy" mean here?

Any function that outputs bets.

Doesn't the modern variation break for programs with lossless encoding/decoding? At least, for sufficiently long sequences? A Huffman/byte-pair encoding would shred any trillion-bit+ sequence, for instance. But I intuitively expect many random trillion-bit sequences exist.
There is no encoding that would shred "any" (read: every) trillion bit sequence. If that were true, some fundamentals of information theory and compressibility would break down.

Lossless encoding works by taking advantage of the more commonly observed sequences of data having lower information entropy. For things like audio encoding, where discontinuous sequences aren't naturally observed (or pleasing to listen to), lossless encoding has a lot to work with.

For any fixed compression scheme, there is an input string that is actually lengthened by it rather than shortened.

However Huffman isn’t a fixed compression scheme since it makes a different frequency tree for different corpora.

The two definitions say different things. What von Mises said is closer to cryptographic definitions of pseudorandomness, and in particular to next-bit unpredictability.
Yes, I agree. But I talked about an idea development and said variation, not necessarily addressing the same thing. The headline would be algorithmically random sequence.

https://en.wikipedia.org/wiki/Algorithmically_random_sequenc...

Do you have a citation? I didn’t know the idea went back that far.
Thanks, I had to dig. I read about it in [1]. Mises was concerned about the formalization of probability theory. It seems the idea appears at least as early as in his 1919 paper [2].

[1] An Introduction to Kolmogorov Complexity and Its Applications, M. Li & P. Vitnányi

[2] Grundlagen der Wahrscheinlichkeitsrechnung, R. von Mises