by east2west 1801 days ago
Already reached the conclusion when I saw a genetic researcher presenting his p-value < 10^-40 as better than < 10^-10. I kept my mouth shut because I didn't want to ruin the poor guy's moment in the sun, but I knew it was time to get out.

My naive understanding is that "smaller p-value" == "more likely result is true".

I know there's always more nuance in statistical reasoning, but the first number is vastly smaller than the second one, right? Is it just that both are hilariously tiny and not credible? Or is there no additional value after you get into the one-in-billions territory?

It would be, but such an imbalance of p-values is unrealistic. 10^-10 probability? If your probabilistic model includes even a one in a billion chance of messing up (10^-9), a p-value of 10^-10 is already too small. That’s before you look at 10^-40... so they are probably both wrong.
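
To put rough numbers on that argument, here is a minimal sketch. It assumes a hypothetical blunder rate `eps` (a bug, a wrong model, a sample mix-up) and, as a simplification that is mine and not from the thread, that a botched analysis always reports a hit. The function name is made up for illustration.

```python
def effective_false_positive(p_nominal: float, eps: float) -> float:
    """Chance of a spurious hit once the analysis itself can be wrong.

    Assumption for illustration only: with probability eps the analysis is
    broken and reports a hit regardless; otherwise the nominal p-value applies.
    """
    return eps + (1.0 - eps) * p_nominal


if __name__ == "__main__":
    eps = 1e-9  # the "one in a billion chance of messing up" from the comment above
    for p in (1e-10, 1e-40):
        print(f"nominal p = {p:.0e}, effective = {effective_false_positive(p, eps):.3e}")
```

Under these assumptions both nominal p-values end up at roughly `eps`, so the thirty orders of magnitude between them buy almost nothing.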

A nice demo of this effect is DNA matching in criminology. Although DNA matching of suspects to DNA samples can be insanely accurate, in practice it is limited by the incidence of monozygotic (identical) twins, which is about 3 in 1,000. You cannot be more certain than this that you got a match, essentially.

Exactly, numerical errors could easily have accounted for the difference between already tiny p-values. The point isn't that the smaller p-value isn't better than the bigger one (it is), but that little significance should have been attached to the difference.

This example is a genome-wide genetic association study. Every genetic variant is tested, so at least 500K linear regressions were performed. This many statistical tests could lead to many false positives just by chance, so one must do multiple-testing corrections. The end result of multiple-testing correction is much bigger and therefore worse p-values. Hence the drive toward ridiculously tiny p-values.
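
As a rough illustration of what the correction does, here is a sketch using the Bonferroni correction. That is one common choice, not necessarily the one used in the study described above. The 0.05 family-wise level is a conventional assumption, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def bonferroni_adjust(p: float, n_tests: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    alpha = 0.05       # assumed family-wise error rate
    n_tests = 500_000  # the lower bound on the number of regressions mentioned above

    print(f"per-test threshold: {bonferroni_threshold(alpha, n_tests):.1e}")
    for p in (1e-3, 1e-10, 1e-40):
        print(f"raw p = {p:.0e}, adjusted p = {bonferroni_adjust(p, n_tests):.1e}")
```

With these numbers a single test has to clear roughly 1e-7 before it counts as significant at all, which is why raw p-values in this kind of study look so extreme.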

Yeah I'm also mystified by that comment. You are correct that smaller P is better. Those near physics level P-values are not totally unheard of for genetics either, because they have very large databanks with hundreds of thousands of data points in them and the ability to do large analyses over them, so they can obtain a lot of statistical power.

Precision in p-values that small is more or less meaningless in almost all cases, because any violation of model assumptions will result in p-value imprecision far greater than 10^-10. p-values are (almost always) approximations based on an approximate model, and the variation between the model and reality is probably more than 10^-10.

Some tiny aspect of the real process that your model fails to capture might mean that that 10^-10 is actually 0.001, and 10^-40 is also 0.001. In complex biological fields it's fair to assume that there are always such tiny aspects.

You're right. The numbers are too small to be plausible. I read on Scott Alexander's blog about 5-HTTLPR that in genetics they can get very low P values relative to most life sciences, but 10^-40 indeed seems far too low for any plausible experiment. I guess even in particle physics they don't go that low.

> My naive understanding is that "smaller p-value" == "more likely result is true".

I think you're making the classic Prosecutor's fallacy: https://en.wikipedia.org/wiki/Prosecutor%27s_fallacy. In my experience, a smaller p-value tends to be more a measure of sample size than anything else, or a sign of an overly restrictive null distribution that is almost certain to be rejected.
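
A quick sketch of the sample-size point, assuming a plain two-sided z-test with a fixed standardized effect. The effect size of 0.02 and the sample sizes are made up for illustration, and the function name is hypothetical.

```python
import math


def two_sided_z_pvalue(effect: float, n: int) -> float:
    """Two-sided p-value of a z-test for a standardized effect measured on n samples.

    The test statistic is z = effect * sqrt(n), and P(|Z| > z) = erfc(z / sqrt(2)).
    """
    z = abs(effect) * math.sqrt(n)
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    effect = 0.02  # assumed tiny standardized effect, held fixed
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(f"n = {n:>9,}, p = {two_sided_z_pvalue(effect, n):.2e}")
```

The effect is identical in every row; only n changes, and the p-value goes from unremarkable to astronomically small.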

I estimate the risk of human error (chose the wrong modeling assumptions, bug in data processing code, etc.) to be at least ~1%, so there really isn't any point in claiming any statistic that is smaller than that.

P values in particle physics are much, much lower than the base human error rate though. Unless you think those are wrong?