by thaumasiotes 1832 days ago
You want to consider, based on a distribution of potential fly populations, what the odds are that sampling two of them will fail to capture two specific ones. For example, you can state with certainty that the population is above 3.

Closely related (though not quite the identical problem): https://www.johndcook.com/blog/2010/03/30/statistical-rule-o...

Applying that directly (p < 3/n, with a sample of n = 2), we would estimate that the probability of a fly in your house being marked is less than 150%, which tells us that you should be taking larger samples.

(With only two marked flies, this estimate will always be less informative than the fact that you sampled n flies -- it will tell you that the population is probably greater than 2/3 of n, while the sampling procedure tells you that the population is necessarily at least n+2. But as you mark more flies, the estimate will be more informative.)
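To make that comparison concrete, here is a rough Python sketch. The function names are made up for illustration, and generalizing from two marked flies to `marked` flies is my own assumption about how the argument extends: the rule-of-three estimate becomes population > marked * n / 3, while the certain bound is n + marked.

```python
def estimated_bound(marked, n):
    """Rule-of-three population estimate: marked/N < 3/n, so N > marked * n / 3."""
    return marked * n / 3


def certain_bound(marked, n):
    """Bound from the procedure itself: n unmarked flies plus the marked ones
    (assumes the n sampled flies are all distinct)."""
    return n + marked


def crossover_sample_size(marked):
    """Smallest n at which the estimate exceeds the certain bound, or None.

    marked * n / 3 > n + marked  <=>  n * (marked - 3) > 3 * marked,
    which has no solution when marked <= 3.
    """
    if marked <= 3:
        return None
    return 3 * marked // (marked - 3) + 1


for marked in (2, 4, 100):
    print(marked, crossover_sample_size(marked))
```

With two marked flies there is no crossover at all, which matches the point above; with more than three marked flies the estimate eventually wins.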


Mostly because I thought it was fun, a worked example where you mark 100 different flies and then catch 100 unmarked flies (releasing each fly individually immediately after catching it; maybe you caught the same fly 100 times):

By the rule of three, we estimate the probability that a fly in your house is marked at p < 3/100.
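For anyone wondering where 3/n comes from: it approximates the 95% upper bound you get by solving (1 - p)^n = 0.05 for p, since -ln(0.05) is just under 3. A small sketch comparing the two (function names are mine, and the 95% level is the usual convention for the rule, not something stated above):

```python
def rule_of_three(n):
    """Approximate 95% upper bound on p after n trials with zero hits."""
    return 3 / n


def exact_upper_bound(n, confidence=0.95):
    """Solve (1 - p)**n = 1 - confidence for p.

    This is the largest p for which n straight misses still happens
    with probability at least 1 - confidence.
    """
    return 1 - (1 - confidence) ** (1 / n)


for n in (10, 100, 1000):
    print(n, rule_of_three(n), exact_upper_bound(n))
```

The rule of three is always slightly on the conservative side of the exact bound, and the two get closer as n grows.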

We can also model the probability that a fly in your house is marked as 100/n, where n is the number of flies in your house.

Then 100/n < 3/100, so n > 100*100/3 ≈ 3,333. There are probably (at roughly 95% confidence, which is what the rule of three gives) more than 3,333 flies in your house.
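The same arithmetic as a quick Python sketch, with a sanity check that the bound behaves as advertised. The function names are hypothetical, and the check assumes each catch is an independent draw with replacement, as in the catch-and-release setup above:

```python
def population_lower_bound(marked, caught):
    """Rule of three: marked/N < 3/caught, so N > marked * caught / 3."""
    return marked * caught / 3


def prob_no_marked(population, marked, caught):
    """Chance that `caught` independent draws (with replacement) all miss
    the marked flies, given the total population."""
    return (1 - marked / population) ** caught


bound = population_lower_bound(100, 100)
print(bound)                             # the population bound
print(prob_no_marked(bound, 100, 100))   # chance of 100 misses at the bound
print(prob_no_marked(10000, 100, 100))   # same, for a much larger population
```

At the bound, catching 100 unmarked flies in a row is a bit less likely than 1 in 20, so any smaller population would make the observed result even more surprising.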