by pfooti
3924 days ago
That's actually a more articulate, but redundant codicil to the argument I made in the rest of the post. Multiple tests will result in significance at some alpha, since you just have to test enough times to get a lucky test. There are techniques (outlined in your link), for addressing that, but the central point I think is still cogent. If you have a test of significance that results in p < 0.01, there's a one percent chance that you're rejecting the null hypothesis due to normally-distributed variation in your data. The base rate fallacy is more about interpreting what that p = 0.01 means, and why systematic bias is important to worry about - if you're testing cancer drugs, you don't want to test them on people who don't have cancer.
No, this is absolutely not true. If p < 0.01, then if there is no systematic effect and only normally-distributed variation, you would see an effect at least this large less than 1% of the time. That is, p is P(data at least this extreme | null is true), and not P(null is true | data). You cannot invert the conditional.
In the extreme case, when the null is true for every test and you use a significance threshold of 0.05, you will get significant results for 5% of them. Thus 100% of your statistically significant results are false positives, no matter how small their p values.
Given that we do not know what fraction of the time the null is true, we cannot know the chance that we're rejecting the null falsely. But whenever true effects are uncommon, it is larger than p, and often much larger.
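To put rough numbers on this, here is a minimal sketch of the Bayes' rule arithmetic. The function name, the 80% power figure, and the list of prior values are all assumptions chosen for illustration; none of them come from the discussion above.

```python
def false_positive_share(prior_null, alpha=0.05, power=0.8):
    """P(null is true | test is significant), by Bayes' rule.

    prior_null: assumed fraction of tested hypotheses where the null is true
    alpha:      significance threshold, i.e. P(significant | null is true)
    power:      assumed P(significant | a real effect exists)
    """
    false_pos = alpha * prior_null          # null true, significant by chance
    true_pos = power * (1.0 - prior_null)   # real effect, correctly detected
    return false_pos / (false_pos + true_pos)


# Hypothetical base rates, ending with the extreme case where the null is always true.
for prior_null in (0.5, 0.9, 0.99, 1.0):
    share = false_positive_share(prior_null)
    print(f"P(null true) = {prior_null:.2f} -> P(null | significant) = {share:.3f}")
```

Under these assumptions, when the null is true for 90% of the hypotheses tested, about 36% of the significant results are false positives, even though every one of them cleared the 0.05 threshold. When the null is always true, the share is 100%.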
This misunderstanding is why scientists routinely overestimate the strength of their evidence and discount the possibility that their results may be flukes.
(Source: I wrote the link provided earlier. Also, the discussion leading to table 1 in this paper is good http://journals.plos.org/plosmedicine/article?id=10.1371/jou...)