by parpfish
91 days ago
As to the philosophy of “why” the CLT gives you normals, my hunch is that it’s because there’s some connection between: a) the CLT requires samples drawn from a distribution with finite mean and variance, and b) the Gaussian is the maximum entropy distribution for a given mean and variance. I’d be curious about what happens if you start making assumptions about higher-order moments in the distribution.
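To make point (b) concrete, here is a rough sketch that compares closed-form differential entropies (in nats) of a few familiar distributions at the same variance. The function names are made up for illustration, and the three comparison families are just convenient examples, not a proof of the maximum entropy property.

```python
import math

def gaussian_entropy(var):
    # Differential entropy of N(mu, var): 0.5 * ln(2 * pi * e * var).
    return 0.5 * math.log(2 * math.pi * math.e * var)

def uniform_entropy(var):
    # Uniform on an interval of width w has variance w^2 / 12 and entropy ln(w).
    return math.log(math.sqrt(12 * var))

def laplace_entropy(var):
    # Laplace with scale b has variance 2 * b^2 and entropy 1 + ln(2 * b).
    b = math.sqrt(var / 2)
    return 1 + math.log(2 * b)

def exponential_entropy(var):
    # Exponential with rate lam has variance 1 / lam^2 and entropy 1 - ln(lam).
    lam = 1 / math.sqrt(var)
    return 1 - math.log(lam)

if __name__ == "__main__":
    var = 1.0
    for name, h in [
        ("gaussian", gaussian_entropy(var)),
        ("laplace", laplace_entropy(var)),
        ("uniform", uniform_entropy(var)),
        ("exponential", exponential_entropy(var)),
    ]:
        print(f"{name:12s} entropy at variance {var}: {h:.4f} nats")
```

Differential entropy does not depend on the mean, so matching only the variance is enough for the comparison; the Gaussian should come out on top.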
The most interesting assumptions to relax are the independence assumptions. The CLT is way more permissive about dependence than the textbook i.i.d. version suggests. You need dependence to decay fast enough, and mixing conditions (such as α-mixing, also known as strong mixing) give you exactly that: when dependence dies off quickly enough, the CLT goes through with the usual √n scaling, and only the limiting variance changes because it picks up the autocovariances. Where it genuinely breaks is long-range dependence: fractionally integrated processes, Hurst parameter above 0.5, where autocorrelations decay hyperbolically instead of exponentially. There the √n normalization is wrong, you get different scaling exponents, and sometimes non-Gaussian limits.
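A small sketch of the scaling difference, computed from autocovariances rather than by simulation. It contrasts an AR(1) process (exponentially decaying correlations) with fractional Gaussian noise, which I'm using here as a stand-in for the long-memory processes mentioned above; the helper names and the choices ρ = 0.5 and H = 0.8 are assumptions for illustration only.

```python
def ar1_autocov(k, rho):
    # Autocovariance of a stationary AR(1) with unit innovation variance.
    return rho ** abs(k) / (1 - rho ** 2)

def fgn_autocov(k, hurst):
    # Autocovariance of unit-variance fractional Gaussian noise.
    k = abs(k)
    e = 2 * hurst
    return 0.5 * ((k + 1) ** e - 2 * k ** e + abs(k - 1) ** e)

def var_of_sum(autocov, n):
    # Var(X_1 + ... + X_n) = n*g(0) + 2 * sum_{k=1}^{n-1} (n - k) * g(k).
    return n * autocov(0) + 2 * sum((n - k) * autocov(k) for k in range(1, n))

if __name__ == "__main__":
    rho, hurst = 0.5, 0.8  # illustrative values, not from the thread
    for n in (10, 100, 1000):
        short = var_of_sum(lambda k: ar1_autocov(k, rho), n)
        long_ = var_of_sum(lambda k: fgn_autocov(k, hurst), n)
        # Under sqrt(n) scaling, Var(S_n) / n should settle to a constant.
        print(n, short / n, long_ / n, long_ / n ** (2 * hurst))
```

For the AR(1) case Var(S_n)/n levels off at the long-run variance, so √n is the right normalization. For fractional Gaussian noise Var(S_n) grows like n^(2H), so Var(S_n)/n keeps growing and the natural normalization is n^H instead.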
There are also interesting higher-order terms. The √n normalization is specifically the scaling that keeps the variance fixed while zeroing out the higher-order cumulants. For the normalized sum, skewness (the standardized third cumulant) decays at 1/√n, excess kurtosis at 1/n, and so on up. Edgeworth expansions formalize this as an asymptotic series in powers of 1/√n with cumulant-dependent coefficients. So the Gaussian is the leading term of that expansion, and Edgeworth tells you the rate and structure of convergence to it.
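Here is a minimal sketch of the first Edgeworth correction, using sums of Exp(1) variables because the exact CDF of the sum (an Erlang distribution) is available in closed form to compare against. The function names are made up for this example, and only the skewness term of the expansion is included.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def standardized_skewness(skew, n):
    # Skewness of a sum of n i.i.d. terms: per-term skewness / sqrt(n).
    return skew / math.sqrt(n)

def edgeworth_cdf(x, n, skew):
    # One-term Edgeworth approximation to the CDF of the standardized sum:
    # Phi(x) - phi(x) * skew / (6 * sqrt(n)) * (x^2 - 1).
    return normal_cdf(x) - normal_pdf(x) * skew / (6 * math.sqrt(n)) * (x * x - 1)

def exact_cdf_exp_sum(x, n):
    # Exact CDF of (S_n - n) / sqrt(n) where S_n is a sum of n Exp(1) draws,
    # via the Erlang CDF: 1 - exp(-t) * sum_{k<n} t^k / k!.
    t = n + x * math.sqrt(n)
    if t <= 0:
        return 0.0
    term, total = 1.0, 1.0  # k = 0 term
    for k in range(1, n):
        term *= t / k
        total += term
    return 1 - math.exp(-t) * total

if __name__ == "__main__":
    skew = 2.0  # skewness of Exp(1)
    for n in (5, 10, 40):
        for x in (-1.0, 0.0, 1.0):
            exact = exact_cdf_exp_sum(x, n)
            print(n, x,
                  abs(normal_cdf(x) - exact),              # plain CLT error
                  abs(edgeworth_cdf(x, n, skew) - exact))  # with 1/sqrt(n) term
```

The plain normal error should shrink roughly like 1/√n, while the corrected version shrinks faster, which is the "rate and structure" the expansion describes.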