by drhagen 3132 days ago
I completely agree that it's important to take this kind of thing into account when approaching a problem like this. As I say in the post, "There are 31 days and one of them has to be smallest. Maybe the 11th isn’t an outlier; it’s just on the smaller end and our eyes are picking up on a pattern that doesn’t exist."

I'll admit that a straight p-value is not the appropriate statistic here. I don't even know what the perfect statistic for this problem is. A Bonferroni correction alone is not enough, because the 11th of the month is not just the lowest for one particular year; it's the lowest for every year.

I was convinced that this was real when I looked at the first line graph of the post. The 11th is the lowest either every year or almost every year, being 3-5 standard deviations below the mean for the bulk of the last 200 years. That just can't happen by chance no matter how you slice it.
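For a rough sense of scale, here is a back-of-the-envelope sketch, not a proper analysis. It assumes a simple null model that is my own simplification: each year is independent, and each of 31 days is equally likely to be that year's lowest. Neither assumption is really true (adjacent years are surely correlated, and not every month has 31 days). The function name and the example numbers are hypothetical, not taken from the data in the post.

```python
from math import comb

def prob_some_day_lowest(n_years, k_lowest, n_days=31):
    """Upper bound on the chance that ANY single day of the month is the
    lowest in at least k_lowest of n_years, under the assumed null model
    (independent years, every day equally likely to be the lowest)."""
    p = 1.0 / n_days
    # Binomial tail: one prespecified day is lowest in >= k_lowest years.
    tail = sum(
        comb(n_years, j) * p**j * (1 - p) ** (n_years - j)
        for j in range(k_lowest, n_years + 1)
    )
    # Bonferroni-style union bound over all n_days candidate days.
    return min(1.0, n_days * tail)

# Hypothetical illustration: the same day is lowest in all 10 of 10 years.
print(prob_some_day_lowest(10, 10))
```

With a single year the bound is 1, which is just the "one of them has to be smallest" point. It shrinks very quickly once the same day keeps coming up lowest, though the independence assumption probably overstates how quickly.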

If anyone knows the proper way to calculate a statistic on something like this, I would love to hear about it.