by bonoboTP
2385 days ago
> The p-value is 1/32, so the null hypothesis is rejected.
No, the p-value is defined as the probability, under the null hypothesis, of a result at least as extreme as the one we obtained. It's not simply the probability of the particular result you obtained, as that would always be zero for continuous quantities! (Remember that for a continuous test statistic the p-value is uniformly distributed over the 0-1 interval under the null, so any criticism that says the p-value is almost always small just by chance must be wrong somewhere.) So first you need to establish a way to rank results by how extreme they are. This is very often trivial and quite objective (the more people cured/made sick, the more extreme the effect of the drug). For the coin flip case, one way would be to call results with more imbalanced ratio more extreme. Then in your 3 heads out of 5 case, the (one-sided) p-value would be the probability of getting 3, 4 or 5 heads out of 5. You can also come up with a different way to define what "more extreme" means (and put it forward in a convincing way); otherwise you simply can't talk about p-values. You can keep talking about likelihoods, but not p-values.
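To make the 3-heads-out-of-5 case concrete, here is a minimal sketch of that one-sided calculation, assuming a fair coin as the null and "number of heads" as the measure of extremeness. The function name `one_sided_p_value` is made up for illustration.

```python
from fractions import Fraction
from math import comb


def one_sided_p_value(heads: int, flips: int) -> Fraction:
    """P(at least `heads` heads in `flips` tosses of a fair coin)."""
    # Count the outcomes with k heads for every k at least as large as
    # the observed count, then divide by the 2**flips equally likely outcomes.
    favourable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return Fraction(favourable, 2 ** flips)


print(one_sided_p_value(3, 5))  # prints 1/2
```

So under this ordering the result is nowhere near 1/32: half of all fair five-toss sequences are at least this imbalanced towards heads.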
Define for me in an objective way what "at least as extreme" is. Let's say I think the string "HHTHT" is extremely indicative of conspiracy. Then the p-value is 1/32 on the measure of "strings of coin flips at least this extremely indicative of conspiracy".
See, this sounds completely ridiculous, but it's not in principle any different from what is done in thousands of social science papers a year. All these supposedly objective procedures have tons of ambiguity. For example:
> For the coin flip case, one way would be to call results with more imbalanced ratio more extreme.
Why an imbalanced total ratio? Why not average length of runs of heads? Average number of occurrences of "HT"? Frequency of alternations between H and T? Average fraction of times H appears counting only even tosses? Given the combinatorial explosion of possible criteria, I guarantee you I can find a simple-sounding criterion on which any desired string of fair tosses gets a low p-value.
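A rough sketch of how much the choice of criterion matters, using the "HHTHT" string from above. It enumerates all 32 fair five-toss sequences and computes an upper-tail p-value under a few different statistics. The statistics and the helper names (`upper_tail_p_value`, `count_heads`, and so on) are picked for illustration only, not taken from any particular paper or library.

```python
from fractions import Fraction
from itertools import product


def upper_tail_p_value(observed: str, statistic) -> Fraction:
    """Fraction of equally likely sequences scoring >= the observed one."""
    threshold = statistic(observed)
    sequences = ["".join(s) for s in product("HT", repeat=len(observed))]
    hits = sum(1 for s in sequences if statistic(s) >= threshold)
    return Fraction(hits, len(sequences))


def count_heads(s: str) -> int:
    return s.count("H")


def count_alternations(s: str) -> int:
    # Number of adjacent positions where the toss changes (H->T or T->H).
    return sum(a != b for a, b in zip(s, s[1:]))


def count_ht(s: str) -> int:
    # "HT" cannot overlap itself, so str.count gives all occurrences.
    return s.count("HT")


def is_exactly(target: str):
    # The "this exact string is suspicious" criterion.
    return lambda s: int(s == target)


observed = "HHTHT"
criteria = {
    "number of heads": count_heads,
    "number of alternations": count_alternations,
    'occurrences of "HT"': count_ht,
    "is exactly HHTHT": is_exactly(observed),
}
for name, statistic in criteria.items():
    print(f"{name}: p = {upper_tail_p_value(observed, statistic)}")
```

The same five tosses get a different p-value under each criterion, and the last criterion reproduces the 1/32 figure, which is the point: the number depends on which ordering of "extreme" is chosen, so the ordering has to be fixed before looking at the data.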