by carlmr 1456 days ago
How can the 99% confidence interval for time in the first example be 7.391 +- 0.26? Most of the values listed lie outside of that.

I got a mean of 7.395 and a sigma of 0.533 (this is without the DoF adjustment, because these are guessed from the histogram). If we assume a normal distribution, the 99% interval is the mean +- 2.576 * sigma, i.e. +- 1.373.
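
A quick way to check that arithmetic, as a minimal Python sketch. It assumes a normal distribution and uses the mean and sigma guessed from the histogram; the variable names are only for illustration.

```python
from statistics import NormalDist

mean = 7.395   # guessed from the histogram
sigma = 0.533  # guessed from the histogram, no DoF adjustment

# A two-sided 99% interval leaves 0.5% in each tail,
# so the multiplier is the 99.5th percentile of the standard normal.
z = NormalDist().inv_cdf(0.995)
half_width = z * sigma

print(f"z = {z:.3f}")                                   # z = 2.576
print(f"99% interval: {mean:.3f} +- {half_width:.3f}")  # 99% interval: 7.395 +- 1.373
```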

In any case we'd also have to consider that we estimated the sigma from the distribution, so we'd have to do an upward correction here: https://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics....

2 comments

This is an old problem resulting from lack of education.

People apply standard deviation without first learning that it only makes sense for data that has a normal distribution.

Standard deviation is a tool for prediction and characterisation. If you know the data set is normally distributed, you can characterise the entire data set with just a few parameters of that distribution, and then use them to predict other information. It is a bit like being able to tell everything about a black hole by stating just its mass, charge and angular momentum.

This only makes sense when the distribution belongs to a special class that shares this characteristic. Specifying the mass, charge and angular momentum of a chair does not let you predict everything about the chair; it only works for black holes.

If you are not convinced, try calculating the standard deviation of the number of testicles in humans. Based on it, infer how many humans have one testicle. How many have more than 5?

Maybe it is the confidence interval on the estimate of the mean, not the 99% interval of the whole distribution.

Update: yeah, the next paragraph describes the uncertainty of the estimate of the mean, which is not the same as the spread of the distribution discussed in the initial motivation.
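
To make the difference concrete, here is a rough Python sketch, assuming a normal distribution. The sigma is the value guessed from the histogram upthread; the sample sizes are hypothetical, since the actual number of runs is not stated in this thread, and `half_widths` is just an illustrative helper.

```python
from math import sqrt
from statistics import NormalDist

def half_widths(sigma, n, confidence=0.99):
    """Return (half-width of the spread, half-width of the interval on the mean)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    spread = z * sigma             # where individual values fall
    mean_ci = z * sigma / sqrt(n)  # uncertainty of the estimated mean
    return spread, mean_ci

sigma = 0.533  # guessed from the histogram upthread
for n in (10, 30, 100):  # hypothetical sample sizes
    spread, mean_ci = half_widths(sigma, n)
    print(f"n={n:3d}  spread: +- {spread:.3f}  mean: +- {mean_ci:.3f}")
```

The interval on the mean shrinks with sqrt(n) while the spread does not, which is how most individual values can lie outside the quoted interval. For small n, a t quantile would be more appropriate than the normal z used here, and it would widen the interval on the mean.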

Ah, that is very misleading though since they're discussing confidence intervals. Maybe some methodology section would be good.