|
This is all scribbled in a notebook, so there's a good chance that it's wrong, but bear with me on the arithmetic.

As a first guess, let's agree that in expectation you earn $0.51 each time you flip the coin, so it should take you about 20 tries to reach $10. Let's do something better, though.

EDIT: The previous paragraph is totally wrong. Thanks, tome :)

Let's build a confidence interval with alpha = 0.01 so that (1 - alpha) = 0.99. First, we'll need some trials. For that, I wrote a program that flipped a weighted coin and played the game until it reached $10 using the rules that you described. I recorded the number of coin flips required in each of 15 trials:

298, 84, 268, 2712, 110, 66, 42, 128, 84, 48, 280, 80, 64, 42, 234

We'll need the sample mean, X_bar = 302. Now, we'll compute the Z-score so that we can build an interval that contains the true mean (mu) with 99% confidence:

P(-z <= Z <= z) = 0.99

We know that Z = (X_bar - mu) / (sigma / sqrt(n)) = (302 - mu) / (sigma / sqrt(n)), where sigma (the standard deviation, estimated from the sample) = 650 and sqrt(n) = sqrt(15) ~= 3.87. I'm rounding. Therefore, Z = (302 - mu) / 168.

Now, let's look at the cumulative distribution function Phi(z). We need Phi(z) = 1 - (alpha / 2) = 0.995, which gives z ~= 2.58. I'll round that up to z ~= 3, the familiar three-standard-deviation cutoff (which covers about 99.7%), so the interval comes out a little wider than strictly necessary.

Thus, we have that P(-3 <= (X_bar - mu) / (sigma / sqrt(n)) <= 3) >= 0.99, so P(X_bar - 504 <= mu <= X_bar + 504) >= 0.99, where 504 = 3 * 168.

Therefore, I am at least 99% confident that the true mean, mu, lies on [X_bar - 504, X_bar + 504] = [302 - 504, 302 + 504] = [-202, 806]. That's a really wide range (the lower end is even negative, which is impossible for a number of flips), and seemingly completely unhelpful for the purposes of betting. More sample trials would teach us more and lead us to a smaller interval, since we expect the sample mean to converge on mu as the number of trials grows. |
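For anyone who wants to check the notebook arithmetic, here is a minimal Python sketch that recomputes the interval from the 15 recorded trials. The function name `normal_ci` is made up for illustration, and it assumes sigma is computed with the population formula (dividing by n), since that is what reproduces the value of about 650. It also relies on the normal approximation, which is shaky for 15 heavily skewed observations.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

# Flip counts from the 15 trials quoted in the comment above.
flips = [298, 84, 268, 2712, 110, 66, 42, 128, 84, 48, 280, 80, 64, 42, 234]


def normal_ci(sample, alpha=0.01, z=None):
    """Return (x_bar, sigma, half_width) for a normal-approximation interval.

    Assumption: sigma uses the population formula (divide by n).
    If z is not supplied, it is the exact quantile for 1 - alpha / 2.
    """
    n = len(sample)
    x_bar = fmean(sample)
    sigma = pstdev(sample)
    if z is None:
        z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar, sigma, half_width


# Exact z for alpha = 0.01 versus the rounded z = 3 used in the comment.
x_bar, sigma, hw_exact = normal_ci(flips)
_, _, hw_rounded = normal_ci(flips, z=3)

print(f"X_bar = {x_bar:.1f}, sigma = {sigma:.1f}")
print(f"exact z:   [{x_bar - hw_exact:.0f}, {x_bar + hw_exact:.0f}]")
print(f"z = 3:     [{x_bar - hw_rounded:.0f}, {x_bar + hw_rounded:.0f}]")
```

Using the exact quantile instead of z = 3 narrows the interval somewhat, but the lower end still lands below zero, so the conclusion that 15 trials tell us very little does not change.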
I wrote a program and did ten thousand trials; the 100th largest hitting time was 5578, which is my estimate of the 99th percentile of the distribution, and thus the answer to the version of the original problem where we don't have to fix the number of coin flips ahead of time.
The median hitting time from my simulation is 152; the mean is 506 and the standard deviation is 1155. Yes, the standard deviation is larger than the mean! The distribution has a very long right tail.
I'm fairly confident (in an informal sense) that the median of the true distribution is somewhere near 152, but I'm not confident about the mean or standard deviation. A lot of distributions in problems like this have tails which decay only like power laws, which makes estimating the mean and standard deviation from a sample very difficult. (I'm not saying that the distribution is a power law, though; it's hard to identify those just by looking at the data.)
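A simulation along these lines could be sketched as follows. The rules of the game are not restated in this thread, so the sketch assumes the simplest reading: each flip wins $1 with probability 0.51 and loses $1 otherwise, and play stops the first time winnings reach $10. The names `hitting_time`, `simulate`, and `summarize` are hypothetical, and this is not the program used for the numbers above.

```python
import random
from statistics import fmean, median, pstdev


def hitting_time(p_heads=0.51, target=10, rng=random):
    """Count flips until winnings first reach `target`.

    Assumed rules (not confirmed by the thread): +$1 on heads, -$1 on tails.
    With p_heads > 0.5 the loop ends with probability 1, but a single run
    can still take a very long time.
    """
    winnings = 0
    flips = 0
    while winnings < target:
        winnings += 1 if rng.random() < p_heads else -1
        flips += 1
    return flips


def simulate(n_trials=10_000, seed=0):
    """Run independent trials and return the hitting times, sorted ascending."""
    rng = random.Random(seed)
    return sorted(hitting_time(rng=rng) for _ in range(n_trials))


def summarize(times):
    """Median, mean, standard deviation, and an order-statistic 99th percentile."""
    times = sorted(times)
    # With 10,000 trials k is 100, so times[-k] is the 100th largest value.
    k = max(1, len(times) // 100)
    return {
        "median": median(times),
        "mean": fmean(times),
        "std": pstdev(times),
        "p99": times[-k],
    }


if __name__ == "__main__":
    print(summarize(simulate()))
```

Rerunning with a few different seeds is a quick way to see the point about the tail: under these assumed rules the median should barely move between runs, while the mean and standard deviation can shift noticeably depending on how many very long runs happen to show up.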