by yummyfajitas 3406 days ago
The p-value is defined as p = P(evidence at least as extreme as that seen | null hypothesis). The standard deviation is only relevant if it is required to compute that number. You can run NHST on distributions without a standard deviation, e.g. a Cauchy distribution.

You might need a standard deviation if you want to do some naive Z-test based on the CLT approximation (since the normal distribution requires a standard deviation), but that's not what XKCD was describing. XKCD was describing an exact test using the true distribution.
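
To make the Cauchy point concrete, here is a minimal sketch of a p-value computed without any standard deviation. The null hypothesis (a standard Cauchy with location 0 and scale 1), the observed value 12.706, and the helper names `cauchy_cdf` and `two_sided_p_value` are all assumptions chosen for illustration; none of them come from the comic or the thread.

```python
import math

def cauchy_cdf(x, x0=0.0, gamma=1.0):
    """CDF of a Cauchy distribution with location x0 and scale gamma."""
    return 0.5 + math.atan((x - x0) / gamma) / math.pi

def two_sided_p_value(x, x0=0.0, gamma=1.0):
    """P(observation at least as far from x0 as x | null), using only the CDF."""
    dist = abs(x - x0)
    upper_tail = 1.0 - cauchy_cdf(x0 + dist, x0, gamma)
    return 2.0 * upper_tail  # the Cauchy density is symmetric around x0

# Hypothetical single observation, tested against a standard Cauchy null.
p = two_sided_p_value(12.706)
print(f"p = {p:.4f}")
```

The p-value comes out at about 0.05 from a single sample. The variance of a Cauchy distribution is undefined, so no standard deviation could have entered the calculation.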


The true distribution is not what they were using. They assumed the dice had zero bias, which is never true for any physical system.

I can say this is a die and therefore it should have distribution X in theory. But that does not mean its actual distribution is X without testing. Further, even after testing, nothing says the distribution will remain unchanged.

Note: The above seems pedantic, but it has significant real-world implications.

All you're saying is models are imperfect. That's true. That doesn't mean you need a standard deviation or more than 1 sample.

In this case, the true odds of rolling 2x6 would have to be 1/20 or higher, rather than the model's 1/36, to invalidate this test. Do you find it plausible that the bias in 2 dice is that high?
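
The arithmetic behind that question can be sketched as follows. It assumes a significance threshold of 1/20 and, purely for illustration, that both dice are independent and biased toward six by the same amount; the variable names are hypothetical.

```python
import math
from fractions import Fraction

alpha = Fraction(1, 20)            # significance threshold, p < 0.05
p_fair = Fraction(1, 6) ** 2       # chance of double six with fair dice: 1/36

# With fair dice, a double six is rarer than the threshold, so the test rejects.
rejects_with_fair_dice = p_fair < alpha

# Assumption: both dice share the same per-die probability q of a six and roll
# independently. Double six then reaches the threshold when q * q == alpha.
per_die_needed = math.sqrt(float(alpha))
relative_bias = per_die_needed / (1 / 6)

print(f"fair double six: {p_fair} = {float(p_fair):.4f}")
print(f"per-die P(six) needed: {per_die_needed:.4f} vs fair {1 / 6:.4f}")
print(f"each die would need to land on six {relative_bias:.2f}x as often")
```

Under those assumptions each die would have to show a six roughly 22% of the time instead of about 17%, which is a bias of around a third on that face.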

In that specific case yes, because there was no dice roll; it was just a comic.

In a wider context, that single data point is evidence that the detector was tripped or was not tripped. But unlike a Bayesian, the frequentist does not then claim to know the actual probability involved, and they don't update their priors, because to do it correctly you need to pick a p-value threshold and a model before doing the test.

Significant: https://xkcd.com/882/ makes a similar mistake by assuming a frequentist would accept that study design before running the tests. Multiple tests require more evidence, though when multiple groups are involved and not all publish you do get this problem.
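
A rough sketch of why multiple tests require more evidence, assuming 20 independent tests each run at a 0.05 threshold. The Bonferroni correction used here is just one common choice for illustration, not necessarily the adjustment the commenter has in mind, and the function name is hypothetical.

```python
def family_wise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests
    when every null hypothesis is true and each test uses threshold alpha."""
    return 1.0 - (1.0 - alpha) ** m

alpha = 0.05   # per-test threshold (assumed)
m = 20         # number of independent tests (assumed)

uncorrected = family_wise_error_rate(alpha, m)

# Bonferroni: divide the threshold by the number of tests.
bonferroni_alpha = alpha / m
corrected = family_wise_error_rate(bonferroni_alpha, m)

print(f"uncorrected chance of a false positive: {uncorrected:.3f}")
print(f"per-test threshold after Bonferroni:    {bonferroni_alpha:.4f}")
print(f"corrected chance of a false positive:   {corrected:.3f}")
```

Without a correction, the chance of at least one spurious "significant" result is about 64%. With the stricter per-test threshold it drops back below 5%.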