by wch
4536 days ago
I agree that there is a general problem that researchers go on fishing expeditions with their data. But having a small sample size doesn't make it any more likely to find a false positive. We've already agreed that a small sample size doesn't make it any more likely to find a false positive for a given hypothesis. This is true for H1, H2, H3, etc., where each of these is a hypothesis. Therefore the aggregate effect of testing N different hypotheses is that you're no more likely to find a false positive with a small sample size vs a large sample size. You are more likely to have false negatives with small samples, though.
It does. Try testing a die for load. Let's say your prior probability of the die being loaded is 50%, because this is a real shady place you're gambling in. You further know (based on the game you're playing) that if your die is loaded, it will land with these frequencies:
Now, you will throw the die on the table a number of times to test it for load. Each throw will give you some evidence. If I've got my calculations correct, landing a 6 nearly guarantees the die isn't loaded, landing a 1 gives you 1 bit of evidence that it's loaded, and landing anything else doesn't tell you anything.
Now what is the probability of a false positive? Well… With only one throw of a genuine die, you will land a 1 one time out of six, giving you a posterior probability distribution of 2/3 loaded, 1/3 genuine (this is as close as you will get to a false positive).
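Here is a minimal sketch of that update in Python. The loaded-die frequencies are an assumption, chosen only to match the evidence described above: a 1 comes up twice as often as on a fair die, a 6 never comes up (a simplification of "nearly guarantees"), and 2 through 5 are unchanged. `posterior_loaded` is a hypothetical helper name.

```python
from fractions import Fraction

# Fair die: every face has probability 1/6.
GENUINE = {face: Fraction(1, 6) for face in range(1, 7)}

# ASSUMED loaded-die frequencies (illustrative, not taken from the thread):
# a 1 is twice as likely, a 6 never lands, 2..5 behave like a fair die.
LOADED = {
    1: Fraction(1, 3),
    2: Fraction(1, 6),
    3: Fraction(1, 6),
    4: Fraction(1, 6),
    5: Fraction(1, 6),
    6: Fraction(0),
}


def posterior_loaded(throws, prior=Fraction(1, 2)):
    """Return P(loaded | throws) by Bayes' rule, starting from `prior`."""
    p_loaded = prior
    p_genuine = 1 - prior
    for face in throws:
        # Multiply each hypothesis by the likelihood of the observed face.
        p_loaded *= LOADED[face]
        p_genuine *= GENUINE[face]
    # Normalize so the two hypotheses sum to 1.
    return p_loaded / (p_loaded + p_genuine)


print(posterior_loaded([1]))     # one throw, landed a 1
print(posterior_loaded([3]))     # one throw, landed an uninformative face
print(posterior_loaded([1, 6]))  # any 6 rules out the loaded die under this assumption
```

Under these assumed frequencies a single 1 moves the posterior from 1/2 to 2/3, which is the "1 bit of evidence" mentioned above, and a single 6 drops it to 0 no matter what else was thrown.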
With 2 throws, it's a bit more complicated:
And so on, as you throw the die over and over again. I'll spare you the calculations, but the simple thing is, a genuine die will get more and more chances to eventually land a 6, rendering the "definitely genuine" observation more and more probable (1 - (5/6)^number_of_throws), and the false positives less and less believable.
Okay, this is a contrived example. But sufficiently large sample sizes do indeed reduce the risk of false positives. It's just that some results are so clear-cut that they don't need large sample sizes to reach a conclusion reliably.
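As a rough check on the 1 - (5/6)^number_of_throws figure, here is a small Python sketch that computes it for a genuine die and compares it against brute-force enumeration for small samples. The function names are hypothetical, and the sketch covers only the chance of seeing at least one 6, not the full posterior.

```python
from fractions import Fraction
from itertools import product


def prob_at_least_one_six(n_throws):
    """Closed form: chance that a fair die shows at least one 6 in n throws."""
    return 1 - Fraction(5, 6) ** n_throws


def prob_at_least_one_six_brute(n_throws):
    """Same quantity by enumerating all 6**n equally likely sequences."""
    outcomes = list(product(range(1, 7), repeat=n_throws))
    hits = sum(1 for seq in outcomes if 6 in seq)
    return Fraction(hits, len(outcomes))


for n in (1, 2, 5, 10, 20, 50):
    print(n, float(prob_at_least_one_six(n)))
```

The printed probability climbs toward 1 as the number of throws grows, so the chance that a genuine die never produces the exonerating 6 shrinks accordingly.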