Hacker News
by chongli 2421 days ago
Because the “why” of statistics has many turtles before you get to the bottom. Measure theory, topology, real analysis, abstract algebra. You need to learn a lot of math before you get a complete picture of the theoretical underpinnings of modern probability theory, which forms the foundation of all of statistics.

Most people just want to calculate a p-value or a 95% confidence interval for the mean of whatever they’re researching. They’re not interested in how it all works.


You need a bit of analysis, but I'm pretty sure that you can understand p-values for normal distributions of a single variable without abstract algebra or topology. Understanding simple cases from first-ish principles makes it much easier to swallow handwavy explanations for complicated stuff.
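
To make the "simple case" concrete, here is a minimal sketch of a two-sided p-value for the mean of a single normal variable with known standard deviation (a z-test). The function name and parameters are my own choices for illustration; only the standard library's `math.erfc` is used.

```python
import math

def two_sided_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a z-test, assuming sigma is known."""
    # Standardize: how many standard errors the sample mean is from mu0.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # For standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

if __name__ == "__main__":
    # Hypothetical numbers: 25 observations, sample mean 10.4, null mean 10, sigma 1.
    print(two_sided_p_value(10.4, 10.0, 1.0, 25))
```

Nothing here needs more than the definition of the normal CDF, which is the point: the single-variable case is reachable with basic calculus.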
You can't understand the why of the normal distribution without Fourier analysis though - which is pretty heavy going for anyone who's not hardcore science or engineering.
Can't you introduce the normal distribution as the limit of the binomial distribution? I think you can prove the central limit theorem without using terribly advanced math.
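
A rough numerical check of that idea, as a sketch rather than a proof: compare the binomial probabilities with the normal density that has the same mean and variance, and watch the largest gap shrink as n grows. The helper names are made up for this example.

```python
import math

def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def max_gap(n, p=0.5):
    """Largest pointwise gap between the binomial pmf and the matching normal density."""
    mean = n * p
    std = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, std))
        for k in range(n + 1)
    )

if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, max_gap(n))
```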
That's only the most limited version of the theorem, which has since been renamed the de Moivre-Laplace theorem [1]. The rabbit hole goes much deeper when you talk about the general form, which works for any set of independent and identically distributed random variables with finite variance, not just binomial random variables.

[1] https://en.wikipedia.org/wiki/De_Moivre–Laplace_theorem

Sure, but do you need the most general form to build some intuition about p-values? I don't think so.
That’s moving the goalposts. The original claim was about getting a complete picture. A full understanding of the “why” of statistics going all the way to the bottom.
Fourier analysis is not that bad. Most kids in the calculus class at the university I went to had the basics of that explained to them.
I'll bite. Why do you need Fourier analysis to understand the normal distribution?
The only proof of the central limit theorem I know of relies on it. Iterated convolutions tend toward a Gaussian. Essentially you take the Fourier transform of any nice probability distribution, do a Taylor expansion, and Bob's your uncle.

Another child poster pointed out that in special cases (like the limiting case of the binomial distribution) you can get away with other arguments. And you can certainly 'prove' it via simulation. So maybe you don't need the full generality of Fourier analysis in the end.
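
For what it's worth, the "iterated convolutions tend toward a Gaussian" claim is easy to poke at numerically. The sketch below is not a proof; it takes an arbitrary lopsided distribution on {0, 1, 2} (the weights are invented for illustration), convolves it with itself n times to get the exact distribution of the sum, and measures the largest gap to the Gaussian density with matching mean and variance.

```python
import math

def convolve(p, q):
    """Discrete convolution: pmf of X + Y for independent X ~ p, Y ~ q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sum_distribution(pmf, n):
    """Exact pmf of the sum of n i.i.d. draws from pmf (values 0..len(pmf)-1)."""
    result = [1.0]  # pmf of the constant 0
    for _ in range(n):
        result = convolve(result, pmf)
    return result

def gap_to_gaussian(pmf, n):
    """Largest pointwise gap between the n-fold sum's pmf and the matching Gaussian density."""
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    m, s = n * mean, math.sqrt(n * var)
    total = sum_distribution(pmf, n)
    return max(
        abs(w - math.exp(-0.5 * ((k - m) / s) ** 2) / (s * math.sqrt(2 * math.pi)))
        for k, w in enumerate(total)
    )

if __name__ == "__main__":
    lopsided = [0.5, 0.3, 0.2]  # hypothetical starting distribution
    for n in (1, 10, 50):
        print(n, gap_to_gaussian(lopsided, n))
```

The Fourier argument explains why this happens for any nice starting distribution; the simulation only shows that it does for the one you picked.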

Iterated convolution was the proof I learned, too. Probably a semantic difference that I don't equate proof with understanding.
> Probably a semantic difference that I don't equate proof with understanding.

I would call that a colloquial definition of understanding. Very different from mathematical understanding.

If you ask a mathematician what you’d need to know to understand Fermat’s last theorem, he won’t say “high school pre-algebra.” That’s only enough for you to understand the basic statement of the theorem. It doesn’t get you to the why. To understand the problem involves a deep dive into both algebraic number theory and analytic number theory.