by scottedwards 4778 days ago
Great effort, and I certainly hope more coders will get into statistics (most I know are only interested in machine learning). However, I think your definition of 1.3 "Confidence Interval around the Mean" could be improved. You state:

"A confidence interval reflects the set of statistical hypotheses that won't be rejected at a given significance level. So the confidence interval around the mean reflects all possible values of the mean that can't be rejected by the data."

That seems a bit vague and perhaps confusing. Might I suggest something more like this:

"The confidence interval specifies a range (+/- a multiple of the above standard error [SE]) around our estimate of the mean (x-bar) such that: if we repeated our sampling process an infinite number of times (i.e. with the same sample size and forming a new x-bar and SE each time [and therefore, a new confidence interval]), Confidence_Level% of those intervals would contain the population (true) mean."

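To make the repeated-sampling reading concrete, here is a minimal simulation sketch. It is only an illustration under stated assumptions: the population is taken to be normal with mean 10 and standard deviation 2, the function name `coverage` and all its parameters are made up for this example, and it uses the normal multiplier (about 1.96 at 95%) rather than a t quantile so that it runs on the Python standard library alone.

```python
import random
import statistics

def coverage(true_mean=10.0, true_sd=2.0, n=100, level=0.95, trials=2000, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    # Normal-approximation multiplier for the chosen level (an assumption;
    # a t quantile would be slightly wider for small n).
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample, then form a new x-bar, SE, and interval.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        if x_bar - z * se <= true_mean <= x_bar + z * se:
            hits += 1
    return hits / trials

print(coverage())
```

With these settings I would expect the printed fraction to land close to the chosen level. The point is that the level describes the long-run behaviour of the interval-building procedure, not any single interval.
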
In addition, I think that in this case, at least, there are few assumptions about the data to worry about (beyond independent observations with finite variance), given a moderately large sample size, thanks to the Central Limit Theorem. I'm confident about that in the case of the mean (x-bar), but I'll leave it to others to correct me if I'm wrong about it also applying to the standard error (SE).
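
One rough way to probe the Central Limit Theorem point is to repeat the coverage check on clearly skewed data at a small and a larger sample size. This is a sketch, not a proof: it assumes independent draws from an Exponential(1) population (true mean 1.0), and the name `exp_coverage` and its parameters are hypothetical, chosen only for this example.

```python
import random
import statistics

def exp_coverage(n, level=0.95, trials=2000, seed=1):
    """Share of normal-approximation intervals that contain the true mean (1.0)
    when each sample is n independent draws from a skewed Exponential(1)."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        # The interval x_bar +/- z * se contains 1.0 exactly when this holds.
        if abs(x_bar - 1.0) <= z * se:
            hits += 1
    return hits / trials

small, large = exp_coverage(5), exp_coverage(200)
print(small, large)
```

I would expect the n=5 figure to fall noticeably short of 95% and the n=200 figure to sit close to it, which is the sense in which a big enough sample lets you stop worrying about the shape of the data.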

1 comment

Confidence intervals are inherently confusing. I have yet to hear a definition that is both correct and easily understood and remembered.