Hacker News
by rcxdude 1232 days ago
It's the fact that you need rare samples. The power of sample size is that you can see finer details relative to the fully zoomed-out view. If you are interested in an effect which is rare, or want to find a small difference between two effects, then you will potentially need a much larger sample size. (For the extremes of this, see the truly gigantic number of samples (trillions+) that are taken in high-energy physics experiments like the LHC: they are looking for very small differences in very rare events. This is also related to why the standards for statistical tests are much higher in this field.)
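
To put rough numbers on that, here is a minimal sketch, assuming each sample is an independent yes/no draw with event probability `p` (a plain binomial model, which is my simplification and not something stated above). The standard error of the estimated rate relative to `p` is then sqrt((1 - p) / (n * p)), so the sample count needed for a given relative precision grows roughly as 1/p. The function name and the 5% target are hypothetical choices for illustration.

```python
import math

def samples_for_relative_error(p, rel_err):
    """Samples needed so the standard error of the estimated rate is about rel_err * p.

    Assumes independent Bernoulli draws with event probability p (illustrative model).
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if rel_err <= 0:
        raise ValueError("rel_err must be positive")
    # Relative standard error is sqrt((1 - p) / (n * p)); solve for n.
    return math.ceil((1 - p) / (p * rel_err ** 2))

# Rarer events need proportionally more samples for the same relative precision.
for p in (1e-1, 1e-3, 1e-6):
    print(f"p = {p:g}: about {samples_for_relative_error(p, 0.05):,} samples")
```

Halving the target relative error quadruples the requirement, which is the other half of the "small difference" point: precision costs samples quadratically.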

I don’t actually care about the tails. I’m fine cutting off the comparison and treating sufficiently rare events as having an expected value of 0. And indeed, the bins that show up with “errors” (i.e. deviating by > 5%) are the ones where events are reasonably expected. The tails are indeed always within 5% of expected.
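
For concreteness, the comparison described here could be sketched roughly as follows. This is only an illustration: the function name, the `min_expected` cutoff, and the choice to skip tail bins entirely are assumptions, not the actual code behind the numbers in this thread.

```python
def flag_bins(observed, expected, rel_tol=0.05, min_expected=1.0):
    """Return indices of bins whose observed count deviates from expected by more than rel_tol.

    Bins with expected < min_expected are treated as tails and skipped
    (hypothetical cutoff; the real threshold is not given in the thread).
    """
    flagged = []
    for i, (obs, exp) in enumerate(zip(observed, expected)):
        if exp < min_expected:
            continue  # tail bin: excluded from the comparison
        if abs(obs - exp) / exp > rel_tol:
            flagged.append(i)
    return flagged

# Hypothetical counts: bin 1 is 10% off, bin 2 is a tail bin and is ignored.
print(flag_bins([100, 110, 0], [100, 100, 0.001]))
```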