Hacker News new | ask | show | jobs
by njohnson41 3682 days ago
Right, the amount of entropy per bit of sequence is always between 0 (deterministic) and 1 (every bit is independent and 50/50) (... or between 0 and log2(k) in general if the element varies over a set of k things). These "weak" sources just have low entropy per bit. They could be biased (more 0s than 1s) or correlated (long runs of 0s/1s or periodicity), or just have some other pattern that sometimes holds.

A deterministic PRNG's sequence has exactly the entropy of it's seed, actually, but it has 0 bits of entropy per symbol, because its sequence is infinite.

The thing most people get confused about with entropy is in thinking that entropy is a property of some single object, like a bit string. Really, entropy is always a measurement about a probability distribution, just like mean or variance is. In the usual case with random streams, the distribution is P(x_i | x_i-1 ... x_0) for bits x_i in the stream, i.e. the distribution remaining for the current bit even if we know all previous bits. For a deterministic PRNG, once we can extract the key from the history (given unlimited compute power) that distribution becomes deterministic, so the entropy is 0.

1 comments

The entropy of a single object is a meaningful concept. It is usually called Kolmogorov complexity.
Kolmogorov complexity is definitely meaningful, but it's not (Shannon) entropy, just conceptually similar. Many people think of something like Kolmogorov-complex sequences when they think of "random" sequences, which is (IMO) why they have trouble thinking of entropy as being about a probability distribution.

The one case where they coincide (sort of) is if you believe your random sequence is generated by a randomly chosen Turing machine, which I've only really seen in philosophical settings.

A uniformly chosen 64-bit integer still has exactly 64 bits of entropy, regardless of how much Kolmogorov complexity the actual bits you generate have.