Hacker News new | ask | show | jobs
by jbandela1 977 days ago
Von Neumann described a very elegant way to get fair results from a biased coin.

1. Flip the coin twice

2. If you get the same result both times, goto 1

3. Now that you have different results for your pair of flips, use the first element of the pair of flips as your result.

https://en.wikipedia.org/wiki/Fair_coin#Fair_results_from_a_...

24 comments

If anyone wants to test it, someone wrote a short code that simulates doing that 100,000 times:

https://www.techiedelight.com/generate-fair-results-biased-c...

The coin is biased to come up TAILS 80% of the time, but using Von Neumann's method in the program I got HEADS 50.035%, TAILS 49.965%.

Why would you test it?

Probability of two heads: p*p

Probability of two tails: (1-p)*(1-p)

Probability of head followed by tails: p*(1-p)

Probability of tails followed by heads: (1-p)*p

It's not difficult to notice that if you remove the first two, the last two form a 50/50 distribution

> Why would you test it?

I recall conversations on Usenet decades ago about the Monty Hall problem[1] in which people gave elementary proofs that probabilities don't change by opening a door. Even from mathematicians and statisticians. People were very insistent that the analytical solution was simple and obvious and that switching doors didn't change anything.

The only thing that changed some people's minds was a program that simulated the Monty Hall problem. This was needed to get people to reconsider their proof when the claim was highly counterintuitive.

[1] https://en.wikipedia.org/wiki/Monty_Hall_problem

The Monty Hall Problem is a fun one because you can try to approach it from a purely analytical perspective and get one answer, while incorporating the whole situation (especially the fact that the final probability is not natural as they force the final decision into far fewer doors than originally present) and testing you can find a different answer entirely.

I suppose this is an interesting corollary with discoveries made by deep theoretical mathematics. While something may seem possible because "the math checks out" it could be only theoretically possible as it relies on some unnatural value to "be" possible in the first place.

Testing is where hopeful theories are smashed by reality until all that remains is the verifiable truth. Truly, why wouldn't we test?

Fundamentally, the trouble with the Monty Hall problem isn't that analysis comes to the wrong answer, it's that people often come to the wrong model when reasoning about it informally.

It's not any harder to do the "correct" analysis than to write up a simulation. It's mostly just easier to convince yourself that the simulation matches the problem description when it reaches the unintuitive result.

Fundamentally, I think the real trouble with the Monty Hall problem is that the assumptions of the game are not clearly stated. Because of this, people come up with different models.
That's how it went when I was solving problems at the Statistics course at university. I modeled the problem perfectly, got the wrong result. Changed assumptions, got the wrong result. Checked the solution, its reasoning didn't make much sense anyway. Run a simulation, got an approximate result close to the correct solution.
This sounds like the classic "tweak the model until the results fit with our preexisting conclusion". Very common across all industries unfortunately.
The way the problem makes sense to me is this.

Doors: Goat Goat Car

You pick a door. Monty shows you a Goat. You switch or stay.

Monty will never show you the Car before offering a switch. He always shows you a Goat. It doesn't matter which Goat he shows you - it's just "not the Car".

If your first choice is a Goat, switching will win you the Car. If your first choice is a Car, switching will win you a Goat. You have a 2/3 chance of picking a Goat, so, effectively, you want to pick a Goat so that you switch to the Car.

For sufficiently analytical folks that works, but for lay people it tends to still be confusing.

The best way I’ve heard it explained to help people get it through intuition is by changing the number of doors and goats. Say there are 100 doors, and they all have goats except one, which has a car. You pick door 1. Monty then proceeds to open doors 2 through 48, skips door 49, and then opens the remaining doors. After all that, he stops and asks you, would you like to switch?

There's a better way to think about it.

1. You pick a door.

2. You get the offer "Do you want to keep that door, or choose both [all] of the other doors? In either case, you'll keep anything that isn't a goat."

3. Nobody opens any doors.

Should you keep your one door, or switch to the two doors?

I always feel like there is something fundamental missing from the examination of Monty Hall problems.

I think it has to do with the difference between "probable outcome in reality" and "probably outcome based on personally known information".

Lets say when you get down to doors #1 and #49, Monty brings in someone new, with no information and says pick a door. For that new person, standing right next to you, doors #1 and #49 have a 50-50% chance, while for you they are a 2% vs 98% chance.

How can door #1 simultaneously have a 2% chance for you and a 50% chance for Bob? The answer is that the chance is not a single fixed property of the door itself- which is hard to wrap ones head around.

And for that matter, Monty Hall himself knows one of the doors is 100% and the other is 0%.

The situation is now counterintuitive in the other direction: if Monty Hall had opened those 48 doors at random and they just happened to not contain the car, then there is no advantage to switching, though many people would insist otherwise.
I've never been happy with that explanation. I don't get why the host would not just open a single door, that's what the host does in the other scenario to me.
The best way that I've thought about it is like this:

You pick a door, then Monty let's you switch to the two remaining doors and if the car is behind either of them you win.

Obviously choosing the two remaining doors is better.

The trick is to realize that Monty showing you the contents of one door and letting you choose the other one is identical to Monty letting you choose both the remaining doors.

This is the best explanation I've seen for the problem so far. Thank you
Even simpler - you pick, knowing nothing, so there's a 2/3 chance you're wrong.

If you're wrong, Monty points to toward the right door.

So you should switch.

I think the great advantage of "simulation", for the programming-literate, is not that you can simulate your way to a correct answer, but that the process of creating a simulation is likely to show you the error in your reasoning.

As a young teenager, I encountered the Monty Hall problem for the first time, and I didn't believe that the "analytical" answer was correct. I decided to simulate it by programming. In the 20 minutes it took me to write a simulation, I went from complete incomprehension to a full understanding of why I got the results I got. Programming a simulation of the problem forces you to write out the algorithmic significance of "Monty reveals one of the goats".

The Monty Hall problem is a fun one to code up, and yeah, there are otherwise smart people who refuse to believe it.

I coded it up in F# https://github.com/jackfoxy/LetsMakeADeal to convince one of the founders of a start-up I worked for. He just grunted and walked away. Pretty sure he still doesn't want to hear about Bayes' Theorem.

Just start with an infinite number of doors and move backwards from that.
The Monty Hall problem is especially unintuitive if you've ever watched Let's Make a Deal, since the problem set up is oh so close to, but not exactly, the set up of the Big Deal in the show. It's too easy to conflate the rules of the show with the math problem, which will lead to confusion.

I think seeing the results of a simulation also elucidates the set up of the math problem vs reading a proof.

Yeah, I feel like the Monty Hall confusion goes away if you are explicit about the rules:

"Hall will always open one of the two non-chosen doors and will never reveal the prize"

I think most people who don't understand the problem miss that critical detail.

No, I don't think that detail makes it any easier. I know that but I still really can't accept the correctness of the Monty Hall strategy (I have to basically just take it on faith and stop trying to understand it). I was trying to put my finger on why, and I think it's this.

After Monty eliminates one of the three doors, then the prize is behind one of the two. If someone were to come in this point, with no prior knowledge whatsoever, their chance of picking the correct door at random is 1/2. And that is still true even if they pick the door which our contestant is being asked whether or not to switch from! This is a real mind fuck to try to accept, that the same state of what's behind each door leads to different odds of making a correct random choice, depending on when you make the choice.

I honestly don't think I'll ever be able to "get" the Monty Hall strategy. I think I get why it works (choosing to switch means you're going from a 1/3 probability to 1/2), but it makes no sense at all. It seems like even if you choose to stay on the same door, your probability is 1/2 (the same as if Joe came in off the street and chose the same door as you). Like I said, I just have to take it on faith.

The Monty Hall problem is small enough to enumerate all the out comes on paper. If you then test all the different scenarios (not many) and compare switching to not switching - switching produces a slight edge. So it can be done without a computer and not a great deal of effort. I am not a mathematician so it was the only way I could prove it to myself at the time.
You can demonstrate the Monty Hall problem solution analytically with Bayesian statistics using prior probabilities, no need to go all the way to Monte Carlo methods.
As a proof the math is entirely sufficient, but for those who may struggle with it the simulation is persuasive.
This was how I convinced someone (RIP) by really stressing every element in the definition of Bayes’ theorem and probability space.
I'm curious. When those people see the simulation, do they then go back to the analysis and uncover their mistaken reasoning? Or do they just continue to reject the analysis but begrudgingly accept the outcome of the simulation? The analysis of the Monty Hall problem is so very simple I find it very odd to staunchly reject it but then be persuaded by the simulation.
> is so very simple

Years ago, when the Monty Hall problem was not well known, I've seen with my own eyes that it was hard to convince some very smart people of the correct answer. Indeed, after being convinced through simulation or exhaustive enumeration of the decision tree or some other way, they would go back and see what went wrong with their initial analysis.

This was how I built an intuition into the Monty Hall problem as well! Wrote a little app that simulated it a decade or so ago when I was discussing with friends!
Something that makes this a lot more intuitive is to increase the number of doors. If there's 100 doors, the host will open all remaining doors except 1, and they will never open the door with the car behind it, then one has a 1% chance of winning the car if they don't switch doors and it would happen only because they initially chose the door with the car.
This isn't so mysterious once you learn a bit of do calculus and realize that observation and intervention are fundamentally different things.
Why would you not - analytical solutions are the rare occurrences might as well approach everything with simulation...
While I agree we should leave the correct answer to simulation, analytical approximations are often surprisingly close and have the benefit of being intuition-building.
You learn a lot more from generating an analytical solution than a simulation, so it's usually worth at least taking a stab at it analytically before jumping to monte carlo methods.
(preface with "in today's world")
Ironically, this reminds me of a story (folk tale?) about Von Neumann himself.

A colleague told him about the Two Trains Problem (https://mathworld.wolfram.com/TwoTrainsPuzzle.html), and Von Neumann replied with the correct answer. When his colleague said, "Ah! You figured out he trick!", Von Neumann replied, "What trick? I just summed up the distances in my head!"

Some of us suck at math.
Math is often much more fun and compelling for some people when you both theoretically prove something works and then also convince yourself of the same thing via a numerical experiment. I’m pretty good at math proofs (pure math PhD, wrote some books and papers), but I still love to do numerical experiments. It’s fun, and you also set yourself up to be able to easily ask different questions that may be very hard to answer theoretically.
For me it's a case of, "see, what I do is powerful after all!" after a 5 minute analytical proof matches 3 hours of simulation work. :-)
Also certain unintuitive things in math/statistics (like the Monty Hall problem) because a lot clearer when you write up a quick simulation.
Some of you might have just suffered from poor math education. I don't believe anyone capable of learning to program competently lacks the cognitive horsepower to do math competently with more or less equivalent ease. Many do however lack the training.
Part of the problem is that the basic statistical model simply neglects to differentiate between observing and doing, which changes the odds. This is very important when trying to reason about causality. When you observe an association like your thermometer shows a high number when it's warm out it's one thing, but when you set your thermometer to a high number you won't get any warmer. Whereas if you warm the room, your thermometer will rise. This symmetry breaking is captured by something called do calculus.
The thing about math is that you can do things in multiple ways.

Theory is useful but so is experiment.

I think about it this way

p(th) = p(t) p(h)

p(ht) = p(h) p(t)

Hence p(th) = p(ht) regardless of coin imbalance as long as both events actually will happen. QED.

Yeah this is simpler. You're just throwing away every pair that isn't a TH or HT
> It's not difficult to notice that

Look I found the mathematician

Some folks have more faith in their ability to derive proofs than write simulations and vice-versa.
> Why would you test it

Is this a wrong way to get a right answer?

> Probability of two heads: pp > Probability of two tails: (1-p)(1-p) > Probability of head followed by tails: p(1-p) > Probability of tails followed by heads: (1-p)p > > It's not difficult to notice that if you remove the first two, the last two form a 50/50 distribution

Very nice way to illustrate why throwing out the duplicate sequences gets back to a 50/50 distribution.

You're of course right, but maybe 1% of the population understands that, while 100% understands the practical test.
Because that's how science is done, theory then practical?

Maybe it's just a meat space thing, but even if the math gives us the answer it still only feels "final" or "true" to me when we've actually tested something out.

Put another way p*(1-p) = (1-p)*p

So the probability is the same. Whereas p doesn't equal (1-p) unless p=0.5

> Why would you test it?

Why not?

The fact that you can show something with a mathematical equation doesn’t make other demonstrations any less cool.

Wow. As usual, Von Neumann makes it look easy
You should change your name to ChocMonteCarloPy :)
That's amazing, but I guess it won't help when the person can choose the bias?

Because according to the study the person can choose the bias by choosing which side start up.

So if the person wants tails based on what you've said, they should always

1. Do the first throw starting tails up.

2. If the first one is tails, then they now want to start second one heads up.

3. If the first one is heads, they will want to try and get heads again to dismiss the results. So they will do heads up.

So assuming for example that they have an ability to control bias 75% vs 25%.

Then there would be 75% chance of getting first as tails. After that 75% chance of getting heads.

So they will have 56.25% chance of getting it right the first 2 rounds.

The worst case for them would be if they get heads first (25% chance), and then are unable to get heads again. Which would be another 25% chance so 6.25% odds to lose with the first round.

So 56.25% chance of winning the first round of 2, or 6.25% losing and 37.5% of having to try again.

And I think the odds would converge at somewhere around 90% to 10%. I didn't do full calculations here, but overall it seems this strategy would increase the bias even more.

> That's amazing, but I guess it won't help when the person can choose the bias?

Alice writes on a piece of paper whether to use the result from the first or the second coin, Bob flips the coins however he likes, then once there are two different sides of the coins up, Alice turns over the paper and reveals to Bob which coin contains the result.

Though I guess that unnecessarily complicates the procedure – maybe Alice can just write "heads" or "tails" on a note and then Bob flips without having seen the note. It essentially replaces the second coin with Alice's mind which hopefully doesn't suffer from the same known bias.

If you're going to go that way you can skip the coin flip entirely. Just get both of them to write heads or tails on a note and then compare. This technique is used in some crypto projects, except instead of writing on a note you share cryptographic commitments.
But they need to remove the possibility of a psychological guessing game. E.g. Bob could've researched before hand that people are 55% likely to pick heads if they can pick by themselves.
That doesn’t remove the possibility of a psychological guessing game, just makes it more convoluted. If Bob knows Alice will pick first, he can still bias the results.
Inconceivable!
At this point you can just play odds and evens: one person picks odd, the other picks even, they both hold up either one or two fingers behind their back, reveal them at the same time, then sum the result. This prevents the randomness from being in any one actor's control. If you're worried that your brain's RNG can be gamed, then put an odd-denominated coin in one hand and an even-denominated coin in another, and mix them up so that even you don't know which hand has which.
I can feel the difference between denominations of my local coins no problem. What you need are a pair of coins with an odd year imprint and an even year imprint.
We still need a study then to confirm that when people try to mix the coins in their hands like that, it would be random enough. And that would take another year...
Suppose Alice needs to take the coin first to herself, to use the aforementioned strategy without intentionally introducing bias, and then using result of that, which would determine whether the first or the second result from Bob would be used. Because otherwise Bob may be able to make psychological "guesses".
There are nice protocols like this that don’t require anyone to visibly flip a coin. See, for example:

https://en.m.wikipedia.org/wiki/Commitment_scheme

° Put the coins in a cup

° Shake the cup vigorously and dump coins on the table

° If coins match, go back to step one

° If coins are opposed take the result of the southernmost coin

You can solve this easily by always flipping with the same side (doesn’t matter which) facing up for all flips.
There is skill to coin flipping. You'd need to blind the flipper, either physically blindfold or make it so they don't know which result is the positive outcome ahead of time.
Or ask the competitors to flip the coin in a manner they doesn’t allow for skill, like put it in a Yahtzee cup and toss from there.
I was imagining spinning the coin with a flick of the finger. That doesn't seem to be gameable to me, but I supposed you'd need to do a lot of flicks to see if flicking the head side or tails side matters. I'd think there's no way a coin can be more likely to spin an odd or even number of times before falling, but weirder things have happened.
Unless they can also introduce bias using strength/technique of the throw.
The final calculation is easy 56.25/(56.25 + 6.25) = 90%, unless the persons skills change between rounds or something.
Yeah, thought so as well, interesting how easily those numbers worked out, but then again it's because 75/25=3 and 3x3 = 9 so the final difference must be 9x between the probabilities or 100 / (9 + 1).

I was still lucky with the numbers as for example with 80% vs 20% it would've been 4x4=16 and so 1 to 16 comes to 100 / 17.

But wasn't the bias in the paper something like 50.5% vs 49.5%?
Yes, but I used more extreme numbers for ease of calculation and to clearly indicate the direction of a probability.
That's really cool. The key insight is that "The reason this process produces a fair result is that the probability of getting heads and then tails must be the same as the probability of getting tails and then heads, as the coin is not changing its bias between flips and the two flips are independent." So you have to make sure you always start with the same side up.
You can load a dice, but you can't bias a coin: http://www.stat.columbia.edu/~gelman/research/published/dice...
I love the math/stats history around gambling-related things, thanks for mentioning this. This method assumes the flipper can't introduce bias. ET Jaynes in his book Probability Theory also mentions that it is easy to learn to flip a fair coin in such a way that the result can be predetermined. I searched a tiny bit for this but couldn't find what he was referring to though.
I’ve heard that, with many hours of practice, dedicated amateurs and many famous magicians are able to do this kind of thing. I wouldn’t call it sleight of hand, but it is similar, although it may fall under that category broadly. I’m not a domain expert but I was taught some simple coin tricks as a child by my artist mom’s artist friend who ran the local frame shop. I never tried or thought to try to favor the coin flip or introduce bias, but it’s definitely a skill that can be acquired.
I remember at one point in my childhood learning a trick where it looks like you're flipping the coin, but you're really only causing it to rotate and wobble, meaning it's guaranteed to land on whatever side faced up as you tossed it. I don't remember how I did it though, and a few minutes of trying to recreate the effect has failed.
Lots of practice, use your thumb to hit the edge of the coin to give you more control, aim for a specific spot so you use the same amount of force and control the amount of times it flips. You can also use a surface that absorbs more so the coin is less likely to flip after hitting it.
The VN debiaser is very simple but it's not very efficient-- it loses a lot of your randomness.

Under the same IID assumption you can take N flips that returned M heads and map them to the N choose M possible ways that could have happened. The result will (under IID assumption, even in the presence of bias) be a uniform number on the range [0..N choose M). The ctz(N choose M) trailing bits can be used directly (as they will be uniform) but the rest would have to be converted to binary via something like an arithmetic coder or rejection sampling.

The result is muuch more efficient.

Less directly, VN debiasers can also be stacked. Each debiaser outputs three streams: the normal one, one that says if the normal one output anything, and one that says if it got HH or TT. Then run VN debiasers on those. Though it takes a fairly large tree to extract most of the entropy.

Also interestingly, this extends beyond a two-sided coin, to any number of possible results, like a die with N sides.

To get a fair result from a biased dN: Roll it N times. If you don't get all N distinct results, restart. If you do, then the first of those is your final result.

Also-related are all the problems like "simulate an 11-sided die with a 6-sided die", where the solution involves identifying certain rolls that should trigger a retry. [0]

In both cases it has to do with shaping the combination of multiple rolls using some knowledge of outcome-symmetries and overlaps.

____

[0] In this particular case, that's something like: Roll the die twice to get rolls A and B; map those a number X in the range range [1,36] using x=(((A-1)*6)+B); if X<=33 then return (x%11)+1 which will be equally likely in the range [1,11]; if X>33 then start over.

This is very interesting, but it assumes a coin is biased the same way every flip.

If a coin is more likely to land on the side it starts on, then the bias can change between flips. To fix this, we just need to make sure the coin starts the same side up before every flip.

I absolutely love that even with a corrupted system you can use properties of the system to ensure just outcomes.
This works only if the bias is constant/independent of the result of the previous flip.
The same principle is used in embedded systems as the base of a prng when obtaining randomness from a possibly biased or imperfect source of noise (eg adc low bits or uninit memory values).

I find it easier to understand if you say “instead of using the level/value as the source of randomness, use the transition from one level/state to the other as the bit of entropy.” (edge-based instead of level-based) I.E. instead of head is 0 and tails is 1, head to tail is 0 and tails to head is 1, and the other transitions are disregarded.

I'm confused, how does this help? If coins are biased to land same-side up, then don't I always have an advantage by guessing whatever side is up before the first throw?
I see what you're getting at, and it's subtle! To simplify the discussion, let's assume we always start out with heads up.

You're right that the first coin is more likely to end up heads. But so is the second coin, and if both occur, that would invalidate the pair of tosses. Now, imagine you guessed tails despite the coin starting on heads. If the first toss lands tails, the second coin is still more likely to land heads, which keeps the pair valid.

In other words, whatever you gain by guessing the side that's up on the first coin, you lose on account of the second coin having that same higher probability of invalidating the pair.

----

Using extreme numbers, in case that makes it more clear: imagine a coin that has a 99 % probability of ending up with the same side we start with, and – for simplicity of exposition – we always start with heads facing up before the toss.

If you guess heads, and the first coin lands heads, then there is a 1 % chance that you win, namely that when the second coin lands tails.

If you guess tails, and the first coin lands tails, then there is a 99 % chance that you win, namely that when the second coin lands heads.

The two outcomes of the first coin (99 % and 1 % respectively) perfectly balance out the two valid outcomes of the second coin.

Ah fascinating! Thank you!
The probability of [HEADS, TAILS] is always the same as the probability of [TAILS, HEADS], no matter how the coin is weighted.
I get that but I don't see how it answers my question?
You're flipping pairs of coins until you get either "HT" or "TT". So the only two possibilities are:

1. keep flipping until you get HT (and so you choose 'heads') 2. keep flipping until you get TH (and so you choose 'tails')

Since HT and TH are equally likely, results 1 and 2 are equally likely, i.e. there's a 50% chance of choosing heads, 50% change of choosing tails.

Basically rejection sampling, which is about obtaining variates from one distribution using variates from another. We have samples from p-coins (where p is the probability of flipping heads) and we wish to generate samples from 0.5-coins.

https://www.newton.ac.uk/files/seminar/20100623134014301-152...

That is only guaranteed to work if subsequent tosses are independent of each other. TFA suggests that they are not.

[edit]

If you always start with the same side of the coin face-up, then the tosses will be independent of each other, but if you e.g. always flip it once or always keep it the same before the next toss, then they are not.

TFA does not suggest that. Even if a coin is biased towards the side it started on, this wouldn’t carry over from one flip to the next.

It would be important to start the flip on the same side, but that’s doesn’t make the second flip dependent on the first

Hah, realized that after I posted and edited while you were commenting. It's a good point. My original assumption would you'd just pick it up and flip it; not attempt to put the same side face up each time.
Couldn't you still game that by choosing the starting side? If you want heads, and you start on heads and get a tails for the first round, you could start on tails on the second round to try to get another tails and reset to step 1.
That only works if the result of the coin is independent of whatever it showed on the prior flip.

(You might say "of course it is!", but if that's your approach to the problem, you should be aware that biased coins don't exist...)

I take it a double headed coin doesn't count as biased for this algorithm. (because it's too biased to still be considered a "coin", for probability's sake.)

Otherwise it never halts.

Ah, but this assumes a fixed unfairness; with this result, you could pretend to do the von neumann method but change starting sides on each flip, giving a biased result.
How come when the results are the same you have to go to 1 and flip twice again? Can you just toss another one and use the last two results?
ok but if the coin tends to land as starting then

1. starting from heads

2. flip heads

3. flip heads

4. starting from heads

5. flip heads

6. flip tails

take 5 = heads?

heads should still be more likely to occur than tails under this scenario, although, Zeno-like, with decreasing likelihood approaching zero over time?

on edit: of course Von Neumann's process has more restrictions, leading closer to fairness.

If you always start with heads the method works out. The key is that the first and the second toss need to be independent so that HT and TH have the same probability. If you influence the second toss based on the first one it no longer works.
> 2. If you get the same result both times, goto 1

Well, yes. The wiki description basically states that you get to throw away results of coin tosses in some particular cases.

In that sense, it's not really any different from just making up the results of the coin tosses entirely. There's 10000 different ways to make your data garbage.

you can do that all you want my coin is the same on both sides
Pocket change is Manchester encoded by default.
With all due respect to Von Neumann, intuitively I would change it to use the information in the two coins: one for (X, Y) and another for (Y, X). Not the first.
Yes, and as the second coin carries no information (because we are focusing now on sets of two different consecutive outcomes) both your and JvN's protocols are equivalent.
Flip till I get the side I wanted