Hacker News
by dzdt 3894 days ago
I don't get this yet. Their lead paragraph says:

> Jack takes a coin from his pocket and decides that he will flip it 4 times in a row, writing down the outcome of each flip on a scrap of paper. After he is done flipping, he will look at the flips that immediately followed an outcome of heads, and compute the relative frequency of heads on those flips. Because the coin is fair, Jack of course expects this empirical probability of heads to be equal to the true probability of flipping a heads: 0.5. Shockingly, Jack is wrong. If he were to sample one million fair coins and flip each coin 4 times, observing the conditional relative frequency for each coin, on average the relative frequency would be approximately 0.4.

If I try to work this out, I write down the 16 possibilities for four coin flips:

    TTTT, TTTH, TTHT, TTHH,
    THTT, THTH, THHT, THHH,
    HTTT, HTTH, HTHT, HTHH,
    HHTT, HHTH, HHHT, HHHH
I count 24 instances where H occurs before the end of the sequence, 12 of which are followed by H and 12 of which are followed by T. So I get the expected 0.5 outcome.
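
As a sanity check on that count, here is a minimal Python sketch that enumerates all 16 sequences and pools every H that has a following flip. The variable names are my own and nothing here comes from the paper.

```python
from itertools import product

# All 2**4 = 16 equally likely sequences of four fair-coin flips.
sequences = ["".join(s) for s in product("HT", repeat=4)]

followed_by_h = 0
followed_by_t = 0
for seq in sequences:
    # Pair each flip with its successor; the last flip has none.
    for current, nxt in zip(seq, seq[1:]):
        if current == "H":
            if nxt == "H":
                followed_by_h += 1
            else:
                followed_by_t += 1

total = followed_by_h + followed_by_t
print(total, followed_by_h, followed_by_t)  # 24 12 12
print(followed_by_h / total)                # 0.5
```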

The authors do some other calculation, and I don't understand what they are thinking. Can someone explain?

2 comments

Here's an explanation that addresses this apparent inconsistency: http://andrewgelman.com/2015/09/30/hot-hand-explanation-agai...
The linked explanation seems to be that if you do the probability wrong in a certain way, you come up with something below 50%.
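
For what it's worth, one reading of the quoted paragraph ("observing the conditional relative frequency for each coin, on average...") is that the proportion is computed separately for each 4-flip sequence and those proportions are then averaged, skipping sequences where no H has a following flip. A minimal sketch under that assumption follows. The helper name `per_sequence_frequency` is hypothetical, and this is my interpretation rather than the authors' code.

```python
from fractions import Fraction
from itertools import product


def per_sequence_frequency(seq):
    """Fraction of H among the flips that immediately follow an H.

    Returns None when no H has a successor, so the ratio is undefined.
    """
    followers = [nxt for cur, nxt in zip(seq, seq[1:]) if cur == "H"]
    if not followers:
        return None
    return Fraction(followers.count("H"), len(followers))


sequences = ["".join(s) for s in product("HT", repeat=4)]

# Keep only sequences where the ratio is defined (assumed: the rest are dropped).
freqs = [f for f in map(per_sequence_frequency, sequences) if f is not None]

# Every remaining sequence gets equal weight, however many H's it contains.
average = sum(freqs, Fraction(0)) / len(freqs)
print(len(freqs), average)  # 14 17/42
```

17/42 is roughly 0.405, which is close to the paper's "approximately 0.4". The difference from pooling all 24 instances is the weighting: a sequence like HHHT contributes one ratio (2/3) here, not three separate observations.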

Here's one way to reproduce the 40% number they get in the paper. Take a sequence of four flips. Consider five cases:

1. 0 heads. Probability of head following a head = 0

2. 1 head. Probability of head following a head = 0

3. 2 heads. Probability of head following a head = 1/3

4. 3 heads. Probability of head following a head = 2/3

5. 4 heads. Probability of head following a head = 1

Now if those five cases were equally likely, then what would be the expected probability of a head following a head?

Answer: (0 + 0 + 1/3 + 2/3 + 1)/5 = 0.4
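
The arithmetic above can be checked with a short sketch. It keeps the assumption that the five head counts are equally likely (for a fair coin they are not: the counts 0 through 4 occur in 1, 4, 6, 4, 1 of the 16 sequences). It also reads the per-case value as (k - 1)/3 for k >= 1 heads, which is my guess at where 1/3 and 2/3 come from. The function name `p_head_after_head` is hypothetical.

```python
from fractions import Fraction


def p_head_after_head(k, n=4):
    # Assumed reading: fix one H; the other k - 1 heads are spread
    # uniformly over the remaining n - 1 positions.
    if k == 0:
        return Fraction(0)
    return Fraction(k - 1, n - 1)


cases = [p_head_after_head(k) for k in range(5)]

# Equal weight on each head count, as in the comment above.
equal_weight = sum(cases, Fraction(0)) / len(cases)
print(equal_weight)  # 2/5
```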

Is this what they assume gamblers are using for 'empirical probability'? I can't tell.

Haha, exactly my question! I wrote out that same table and got the same results.

Their calculation seems to be something like

sumOverRunLengths[P(runLength)*(1 - runLength)].

I also wish I knew how to get HN to keep my newlines like you did.