by Jerrrrrrry 603 days ago
I have the humility to admit that this, despite everything I pretend to know, has always escaped my understanding.

Someone please (jump?) at the chance to explain this one to me.

(assume i failed 9th grade 3 times)


The key assumption is that T and H may not have the same probability, but each flip isn't correlated with past or future flips. Therefore, TH and HT have the same probability. So you can think of TH as "A" and HT as "B" then you repeatedly flip twice until you get one of those outcomes. So now your coin outputs A and B with equal probability.
I feel like I am missing something so obvious that I feel the need to correct wiki, but that likely means I am fundamentally missing the point.

"The Von Neumann extractor can be shown to produce a uniform output even if the distribution of input bits is not uniform so long as each bit has the same probability of being "one"->[first] and there is no correlation between successive bits.[7]"

As long as the person doesn't favor which of the two bits they chose is "first", then it should appear as random.

But that is self-defeating, as if the person had the capability to unbiased-ly choose between two binaries, they wouldn't need the coin.

But since the only way to determine the variation from expectation is repeatedly increasing sample size, I don't see how doing it twice, and just taking encoding of the bits, then...

Is the magic in the XOR step? To eliminate the most obvious bias (1v5 coin), until all that could have been left was incidental? Then, always taking the first bit, to avoid the a priori requirement of having a fair coin or a fair way of choosing between two options?

and it clicked. Rubber duck debugging, chain of thought, etc.

I will actually feel better now.

>To eliminate the most obvious bias (1v5 coin), until all that could have been left was incidental?

There is only one coin, flipped _twice_; not a running occurrence, but in couples, perfectly simulating two coins functionally.

Once a pair of flips finally differs (the XOR of the two is 1), which happens eventually no matter how biased the coin is, the order in which they differ will be random.

Two sides to a coin, no matter how random, still half the chance.

(for lurkers cringing at my subtle misunderstanding)

Maybe I don't understand why or what you don't understand but...

Say you have a biased coin. It lands heads 55% of the time (but you don't know that). Then the probabilities for two flips are:

HH = (0.55 * 0.55) = 0.3025

TT = (0.45 * 0.45) = 0.2025

HT = (0.55 * 0.45) = 0.2475

TH = (0.45 * 0.55) = 0.2475

If you disregard the HH and TT results then the equal probabilities of HT and TH result in a perfect binary decider using a biased coin. You assign HT to one result and TH to the other.
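
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes independent flips; the function name pair_probabilities is made up for illustration. It uses exact fractions, so there is no floating-point rounding to argue about.

```python
from fractions import Fraction

def pair_probabilities(p_heads):
    """Probability of each two-flip outcome, assuming independent flips.

    p_heads is a Fraction giving the (hypothetical) chance of heads.
    """
    p = Fraction(p_heads)
    q = 1 - p
    return {"HH": p * p, "HT": p * q, "TH": q * p, "TT": q * q}

probs = pair_probabilities(Fraction(55, 100))

# Probability that a pair is kept (HT or TH) rather than discarded.
kept = probs["HT"] + probs["TH"]

# Conditional probabilities once HH and TT are thrown away.
p_ht_given_kept = probs["HT"] / kept
p_th_given_kept = probs["TH"] / kept

print({name: float(value) for name, value in probs.items()})
print(p_ht_given_kept, p_th_given_kept)
```

Swapping in any other bias strictly between 0 and 1 leaves the two conditional probabilities equal, since HT and TH are the same product in a different order.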

Maybe this intuitive "proof" will help.

Coins, dice, and other physical objects with detectable outcomes may or may not be biased; it depends on how they were made and on any manufacturing defects. But, at a minimum, such bias can often be sidestepped.

Consider this argument from John von Neumann.

Suppose you have a single biased coin with these outcome probabilities:

A) Heads (1) 60% (Call this probability p.)

B) Tails (0) 40% (The probability of this outcome is q=(1-p), by definition.)

Now let us apply this algorithm to sequential tosses for this coin:

1) Toss the coin twice.

2) If you get heads followed by tails, return 1. (Say this outcome occurs with probability p’.)

3) If you get tails followed by heads, return 0. (The probability of this outcome is q’=(1-p’), by definition.)

4) Otherwise, ignore the outcome and go to step 1.
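
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: biased_flip is a hypothetical simulated coin (a real source would replace it), the names are made up, and the flips are assumed to be independent.

```python
import random

def biased_flip(p_heads, rng):
    """Hypothetical simulated coin: 1 (heads) with probability p_heads, else 0."""
    return 1 if rng.random() < p_heads else 0

def von_neumann_bit(flip):
    """Return one bit from a possibly biased source of independent flips.

    flip is any zero-argument callable returning 0 or 1.
    """
    while True:
        first, second = flip(), flip()      # step 1: toss twice
        if first == 1 and second == 0:      # step 2: heads then tails
            return 1
        if first == 0 and second == 1:      # step 3: tails then heads
            return 0
        # step 4: HH or TT, discard both tosses and try again

rng = random.Random(12345)  # fixed seed so the run is repeatable
bits = [von_neumann_bit(lambda: biased_flip(0.6, rng)) for _ in range(20000)]
fraction_ones = sum(bits) / len(bits)
print(fraction_ones)  # should land close to 0.5 despite the 60/40 coin
```

Note that the loop never terminates if the source can only ever produce one side, which matches the intuition that a two-headed coin carries no randomness to extract.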

The bit stream that results is devoid of bias. Here’s why. The probabilities of obtaining (0 and 1) or (1 and 0) after two tosses of the coin are the same, namely p(1-p). On the other hand, if (1 and 1) or (0 and 0) are thrown, those outcomes are ignored and the algorithm loops around with probability 1 – 2p(1-p). So, the probability (p’) of getting a 1 using this algorithm after any sequential two tosses of the coin is p’ = p(1-p) + p’(1-2p(1-p)). The solution of which is p’=1/2, and since q’=(1-p’), then q’=1/2. A fair unbiased toss!
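
For anyone who wants the algebra spelled out, here is the same equation solved step by step, using the same symbols and assuming 0 < p < 1 so that dividing by p(1-p) is allowed:

```latex
\begin{align*}
p' &= p(1-p) + p'\bigl(1 - 2p(1-p)\bigr) \\
p' - p'\bigl(1 - 2p(1-p)\bigr) &= p(1-p) \\
2p(1-p)\,p' &= p(1-p) \\
p' &= \tfrac{1}{2}, \qquad q' = 1 - p' = \tfrac{1}{2}
\end{align*}
```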

In fact, the example bias numbers given above don’t matter for the argument to hold (note that after solving for p’ it is independent of p, provided the coin can land on both sides). The outcome of the algorithm is a fair toss (in terms of the (0 and 1)-bit stream that results), regardless of the actual bias in the coin for a single toss. All the bias does is affect the efficiency with which the bit stream is created, because each time we toss heads-heads or tails-tails we loop around and those two tosses are thrown away (lost). For an unbiased coin the algorithm is 50% efficient: on average, half of the toss pairs yield a bit. For a biased coin the algorithm is less than 50% efficient (for a coin of unknown bias, at most 50%), but the output still has the guarantee of being unbiased.
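
The efficiency remark can be made concrete with a small Python sketch. It assumes independent flips, so the number of pairs needed follows a geometric distribution with success probability 2p(1-p); the function name expected_tosses_per_bit is made up for illustration.

```python
def expected_tosses_per_bit(p_heads):
    """Average number of single tosses needed per output bit.

    A pair is kept with probability 2*p*(1-p), so the expected number of
    pairs is 1 / (2*p*(1-p)), and each pair costs two tosses.
    """
    keep = 2 * p_heads * (1 - p_heads)
    if keep <= 0:
        raise ValueError("the coin must be able to land on both sides")
    return 2 / keep

for p in (0.5, 0.55, 0.6, 0.99):
    print(p, expected_tosses_per_bit(p))
```

A fair coin needs 4 tosses per bit on average, and the cost only grows as the coin gets more lopsided.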

This algorithm is trivial to implement for the Satoshi9000.

Thank you so much for explaining it with a concrete example, now even I understand it :)

This really is a useful idea.

  >Maybe I don't understand why or what you don't understand but...
Small misstep on my part, caused by imagining an extremely biased heads example (99% H, 1% T).

When imagined, the first result is 99% Heads...until you finally flip a Tails.

We had to do this exact thing in 6th grade, and I picked proving 5%...fml.

I forgot that they are discrete pairs, not continuous (like my headcanon).

The XOR is the magic. Always has been.