by zug_zug 659 days ago
> Thus, if and only if the lady properly categorized all 8 cups was Fisher willing to reject the null hypothesis – effectively acknowledging the lady's ability at a 1.4% significance level (but without quantifying her ability).

It's important to realize, though, that failure to categorize all 8 doesn't prove anything either. It just means this one experiment isn't conclusive by itself (at 95% confidence).

It's good to be aware of how easy it can be to get a false result by chance, but it's imo a worse statistical sin to propose that not proving something is proving the opposite (a mistake I see quite often).


Also, you need to consider your prior probabilities. If you performed an experiment that showed, with 0.001 < p < 0.05, that the sun has spontaneously stopped undergoing fusion, I wouldn't be very worried.

This is generally good advice, but isn't it inappropriate in this specific instance?

The lady's claim was (allegedly) that she has a perfect ability to distinguish between the tea-milk orders, so in that case even a single failure is indeed enough to reject her claim.

We can't rule out her success rate being significantly greater than 50-50, but even a single failure puts some bounds on her maximum success rate.

>> The lady's claim was (allegedly) that she has a perfect ability to distinguish between the tea-milk orders

I believe you added the word "perfect", which makes a substantive difference. I think this highlights the complications that arise when trying to turn a simple proposition into a meaningful claim:

- Can we prove that person X can observe taste of tea with > 50% reliability with 95% confidence (what Fisher did)

- Can we prove that person X can observe taste of tea with 100% reliability with 95% confidence (not statistically possible)

- Can we prove that person X cannot observe taste of tea with > 50% reliability with 95% confidence (only possible if this person guesses wrong more often than randomly)

- Can we prove that person X cannot observe taste of tea with 100% reliability with 95% confidence (just need one example)

The probability of guessing all eight correctly is 0.5^8 (or roughly 0.39%). The chances of such a thing happening by mere fluke are quite slim. Now personally, I would have preferred a few more glasses to be even more certain, but hey, for all practical purposes those results do seem fairly credible.

Minor, but no, it's 1/(8 choose 4) = 1/70, because there are 4 tea-first and 4 milk-first cups, and the lady tasting tea knows this.

Once the locations of the four tea-first cups are decided, the locations of the remaining milk-first cups are completely determined. (And there are 8 slots for those first 4 cups, hence 8 choose 4).
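The counting argument can be checked in a few lines. This is a sketch assuming pure guessing with the 4-and-4 split known to the taster; the function name is made up for illustration. It also shows why anything short of a perfect score doesn't reach 5%.

```python
from math import comb


def p_exact_correct(k, n_cups=8, n_tea_first=4):
    """P(exactly k of the cups picked as tea-first really are tea-first),
    assuming the taster picks n_tea_first cups uniformly at random."""
    n_milk_first = n_cups - n_tea_first
    total = comb(n_cups, n_tea_first)  # 8 choose 4 = 70 possible picks
    # Choose k true tea-first cups and fill the rest from milk-first cups.
    return comb(n_tea_first, k) * comb(n_milk_first, n_tea_first - k) / total


p_all_correct = p_exact_correct(4)                          # 1/70
p_three_or_more = p_exact_correct(3) + p_exact_correct(4)   # 17/70
p_coin_flips = 0.5 ** 8                                     # independent-guess figure

print(f"all correct, 4-vs-4 design: {p_all_correct:.4f}")
print(f"at least 3 of 4 correct:    {p_three_or_more:.4f}")
print(f"8 independent coin flips:   {p_coin_flips:.4f}")
```

The 1/70 figure is about 1.4%, matching the significance level quoted at the top of the thread, while getting at least 3 of the 4 right happens by chance roughly 24% of the time.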

Ah yes, I stand corrected! In that case I suppose more trials really would have made the results more convincing, too...

I wonder whether pouring them together in addition to the first or second cases would be useful for helping to ascertain how keen the sense was.