by Pyxl101 3470 days ago

> But simply sampling 10 people is about as good as 10 anecdotes.

That is not generally true. It depends on what statistical assumptions you make and what analysis you conduct. The term statistical power describes our chance of correctly detecting an effect when there is one to observe. If the effect is small, then more samples are required to achieve a given level of statistical power, whereas if the effect is large, then fewer are required.

I recommend that you consider the question: if 10 samples are not enough, then what specific number of samples is enough? How do you decide? Fortunately, these questions have been studied in the field of statistics.

Let me give you an example. Let's say I told you that you will flip a coin in the air, and when the coin reaches its peak height, I will shout "heads" or "tails". What would you make of it if we ran this experiment 10 times, and I correctly guessed the outcome 10 out of 10 times? Perhaps you would conclude I really can predict coin flips. By comparison, if I called the outcome accurately only 5 out of 10 times, then you'd probably consider my claim false.

But, consider these two possibilities: (1) what are the odds that my guesses are really no better than random chance, and I've just guessed 10 out of 10 correctly by good luck? (2) What if I really am accurate 99% of the time, but I guessed only 5 out of 10 correctly by bad luck? Statistics allows us to evaluate how likely these things are.

If I'm doing the math correctly, then you'd expect someone to guess 10 out of 10 coins correctly just by chance with probability (1/2)^10 = 1/1024, or about once in every ~1000 experiments. So to see this happen is not witnessing an extraordinarily improbable event; run enough experiments of 10 coin flips and you will see it.

If someone truly has 99% accuracy, then in most experiments they will guess 10 out of 10 correctly: they should get all 10 right about 90% of the time (0.99^10 ≈ 0.904). A person who is truly 99% accurate will guess exactly 5 out of 10 flips correctly only about once in every 40 million experiments (C(10,5) × 0.99^5 × 0.01^5 ≈ 2.4 × 10^-8). So it is extraordinarily unlikely that you will see someone with 99% accuracy guessing 5 out of 10 coin flips correctly. It can still happen just by chance, but it's really improbable.
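For anyone who wants to check these numbers, here is a minimal sketch using the binomial distribution. It assumes each call is independent and correct with a fixed probability p; the function name `binom_pmf` is just an illustrative choice, not something from a particular library.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k correct calls out of n,
    assuming independent calls that are each correct with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Scenario 1: a pure guesser (p = 0.5) gets all 10 right.
p_lucky = binom_pmf(10, 10, 0.5)

# Scenario 2: a caller who is truly 99% accurate.
p_perfect = binom_pmf(10, 10, 0.99)  # all 10 right
p_unlucky = binom_pmf(5, 10, 0.99)   # exactly 5 right

print(f"guesser, 10/10:      {p_lucky:.6f} (1 in {1 / p_lucky:,.0f})")
print(f"99% accurate, 10/10: {p_perfect:.4f}")
print(f"99% accurate, 5/10:  {p_unlucky:.3e} (1 in {1 / p_unlucky:,.0f})")
```

Note the `comb(n, k)` factor: there are 252 different ways to pick which 5 of the 10 calls are the wrong ones, so it matters for the 5-out-of-10 case.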

Bringing this all back to the main topic, it is possible for a result of 10 data points to count as convincing evidence against the theory that there is a strong effect, such as that musicians are 99% accurate in discerning the type of violin, while it may be inadequate to evaluate whether there is a weak but still-present effect, such as 51% accuracy. Whether the number of samples is good enough depends on how small an effect you want to measure, and how confident you want to be in your assessment.
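To make the "depends on the effect size" point concrete, here is a rough sketch of a power calculation. It assumes a one-sided exact binomial test against pure chance (p = 0.5) with a 5% false-positive limit; that test, the 5% level, and the function names are my own choices for illustration, not the only reasonable ones.

```python
from math import comb

def pmf(n: int, p: float) -> list[float]:
    """Binomial probabilities of 0..n correct calls out of n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def power(n: int, p_true: float, p_null: float = 0.5, alpha: float = 0.05) -> float:
    """Chance of detecting accuracy p_true with n trials.

    'Detecting' means scoring at least c correct, where c is the lowest
    threshold that a chance-level guesser reaches with probability <= alpha.
    """
    null = pmf(n, p_null)
    alt = pmf(n, p_true)
    # Walk the threshold down from n while the false-positive rate stays <= alpha.
    c = n + 1
    tail_null = 0.0
    while c > 0 and tail_null + null[c - 1] <= alpha:
        c -= 1
        tail_null += null[c]
    return sum(alt[c:])

# Plain floats are fine up to about n = 1000; much larger n would need
# log-probabilities or a statistics library.
for n, p in [(10, 0.99), (10, 0.51), (1000, 0.51)]:
    print(f"n={n:>4}, true accuracy {p:.2f}: power = {power(n, p):.3f}")
```

With 10 trials the strong effect is detected nearly every time, while the 51% effect is detected barely more often than a false positive would occur, and even 1000 trials leaves the power for 51% well under one half.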

A free book is available online called "Statistical Inference for Everyone" which introduces these topics. https://github.com/bblais/Statistical-Inference-for-Everyone