by ckelly
1225 days ago
It has always surprised me that many technology professionals (and business professionals in general) don't have a strong intuition for the power of sampling. For example, in this case, the author states:
"With 100 samples, our estimates are accurate to within about 5%.
The magic of sampling is that we can derive accurate estimates about a very large population using a relatively small number of samples.
In the last scenario (100 billion M&MS), we have 1% accuracy despite only sampling 0.00001% of the M&Ms." I bet many would think n=100 would be worthless once the population reaches millions, and especially billions. One HN-related piece of evidence for that: when I pointed out what the margin of error would be for an n=164 survey sample, I got downvoted hard!
https://news.ycombinator.com/item?id=8050801 And I saw the same reaction hundreds of times talking to customers when I ran a survey sampling product out of YC.
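
For anyone who wants to check the intuition, here is a rough sketch of the usual margin-of-error formula for a proportion. The assumptions are mine, not the article's: a simple random sample, a 95% confidence level (z = 1.96), and the worst case p = 0.5. The function name and the optional finite population correction are just for illustration.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a sampled proportion.

    Assumes a simple random sample. p=0.5 is the worst case and
    z=1.96 corresponds to roughly 95% confidence.
    """
    moe = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: only matters when n is a
        # noticeable fraction of the population.
        moe *= math.sqrt((population - n) / (population - 1))
    return moe

if __name__ == "__main__":
    for n in (100, 164, 10_000):
        for population in (10**6, 10**9, 10**11):
            moe = margin_of_error(n, population)
            print(f"n={n:>6} N={population:.0e} margin={moe:.4f}")
```

Under these assumptions the margin barely moves as the population grows from a million to 100 billion. It depends almost entirely on n, which is the part people find hard to believe.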
|
The problem is that the relationship between sample size and power is quadratic: the sample size you need grows with the variance and with the inverse square of the effect size. Quadratic relationships are unintuitive, differences spanning many orders of magnitude are unintuitive, and variance is unintuitive. So it’s like a superformula for the human brain to not guess accurately.
I write experimentation software, and with samples in the millions, data scientists still want more power. Then you run some internal experiment with n=20 and it’s like “oh yeah, super significant”.
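
To make the quadratic part concrete, here is a minimal sketch using the textbook normal-approximation formula for a two-sided, two-sample comparison of means with equal group sizes and a known common standard deviation. It is not how any particular experimentation platform computes power; the function name and defaults (alpha = 0.05, power = 0.8) are assumptions for illustration.

```python
from statistics import NormalDist

def samples_per_group(effect, sigma, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sample z-test.

    n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / effect^2
    Returns a float; round up before using it as a real sample size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 * sigma**2 / effect**2

if __name__ == "__main__":
    # Same variance, shrinking effect size: n grows with 1 / effect^2.
    for effect in (1.0, 0.5, 0.1, 0.01):
        n = samples_per_group(effect, sigma=1.0)
        print(f"effect={effect:<5} n per group ~ {n:,.0f}")
```

Halving the effect size quadruples the required sample, and a 100x smaller effect needs 10,000x the data. That is how a tiny internal experiment with a huge effect can come out significant while a product experiment with millions of users is still underpowered.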