Hacker News
by joe_the_user 2984 days ago
It seems like your reasoning and the reasoning of the author could be applied to any statistic testing the reliability of a hypothesis, not simply p-values. Further, you could mitigate this problem if you knew the prior probability, sure. But how do you expect a bad hypothesis generator to be good at knowing the prior probability? The usual standard is "extraordinary claims require extraordinary evidence." The less likely a hypothesis, the stronger the evidence required, measured as p-values or otherwise.

But the thing is, the public and the scientific community have to be the ones who judge the extraordinariness of a claim. If an experimenter were to wrap their results in their own belief in the likelihood of the hypothesis, the observer wouldn't be able to judge anything. So it seems like experimenters reporting p-values is as good a process as any. It's just that the readers of results need to be critical and not assume .05 is a "gold standard" in all cases.
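To make the dependence on the prior concrete, here is a rough sketch of the standard Bayes' rule calculation for how likely a hypothesis is to be true given a "significant" result. The function name, the assumed power of 0.8, and the example priors are illustrative assumptions, not numbers from the article or this thread.

```python
def prob_true_given_significant(prior, alpha=0.05, power=0.8):
    """P(hypothesis is true | result is significant), via Bayes' rule.

    prior: assumed probability the hypothesis is true before the experiment
    alpha: significance threshold, i.e. false positive rate under the null
    power: assumed probability of a significant result if the hypothesis is true
    """
    true_positive = prior * power
    false_positive = (1.0 - prior) * alpha
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Hypothetical priors: a coin-flip claim, an unlikely one, an extraordinary one.
    for prior in (0.5, 0.1, 0.01):
        posterior = prob_true_given_significant(prior)
        print(f"prior={prior:<4}  P(true | p < .05) = {posterior:.3f}")
```

Under these assumptions, the same p < .05 result leaves a 50/50 hypothesis at about 0.94, but a 1-in-100 hypothesis at only about 0.14. Tightening the threshold to alpha=0.001 brings the 1-in-100 case back up to roughly 0.89, which is the "extraordinary evidence" idea in numbers. The reader still has to supply the prior.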

1 comments

> It seems like your reasoning and the reasoning of the author could be applied to any statistic testing the reliability of a hypothesis, not simply p values.

Precisely. That's the point. Hypothesis testing is inherently absurd.

Hypothesis testing is the "soul" of science.

What's impossible is for the output of a single experiment, by itself, to give certainty about a hypothesis, a fixed probability for it, or anything fully quantified.

You're always going to have the context of reality. Not only will you have the null hypothesis, you'll have competing hypotheses to explain the given data.

But the point of science isn't blindly constructing experiments. It's forming something you think might be true and doing enough careful experiments to convince yourself and others, in the context of our overall understanding of the world, that the hypothesis is true. Common sense, Occam's Razor, the traditions of a given field, and so forth go into this.

Then, hypothesis testing was born in the context of industrial quality control, where the true data-generating process is very close to being well known and a deviation from the norm raises a red flag rather than suggesting new knowledge about how breweries work.