by sshrin 4763 days ago
Thanks for this explanation. I understood most of it but could you explain why you should normalize using 1/sqrt(n) and why doing so makes the result converge in distribution?

For a sequence of independent random variables with the same variance, X_1, X_2,..., we have

  var( (1/sqrt(n)) * (X_1 + X_2 + X_3 + ... + X_n) )
    = (1/n) * (var(X_1) + var(X_2) + ... + var(X_n))
    = (1/n) * n * var(X_1)
    = var(X_1)
This holds for any n, so if you normalize by 1/sqrt(n) instead of 1/n, the "randomness" never vanishes, no matter how large n gets. If you normalize by something that shrinks more slowly than 1/sqrt(n), the variance blows up as n grows; if you normalize by something that shrinks faster than 1/sqrt(n), such as 1/n, the variance collapses to zero and you get something concentrated at a single point.
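If you want to see the three regimes numerically, here is a rough simulation sketch. The choice of Uniform(-1, 1) draws (so var(X_1) = 1/3), the function name `scaled_sum_variance`, and the trial counts are all my own assumptions for illustration, not anything special. It estimates the variance of n^(-a) * (X_1 + ... + X_n) for a = 0.25, 0.5 and 1.

```python
import random
import statistics


def scaled_sum_variance(n, exponent, trials=2000, seed=0):
    """Estimate var(n**(-exponent) * (X_1 + ... + X_n)) by simulation.

    Assumption for illustration: the X_i are independent Uniform(-1, 1)
    draws, so each has mean 0 and variance 1/3.
    """
    rng = random.Random(seed)
    scale = n ** (-exponent)
    samples = [
        scale * sum(rng.uniform(-1.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    # Population variance of the simulated scaled sums.
    return statistics.pvariance(samples)


if __name__ == "__main__":
    # Columns: exponent 0.25 (too weak), 0.5 (1/sqrt(n)), 1.0 (1/n).
    for n in (10, 100, 1000):
        row = [scaled_sum_variance(n, a) for a in (0.25, 0.5, 1.0)]
        print(n, [round(v, 4) for v in row])
```

With exponent 0.5 the estimate should stay near 1/3 for every n, while the 0.25 column keeps growing and the 1.0 column keeps shrinking, up to simulation noise.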

The CLT tells us more than that: it tells us how the randomness is distributed when n gets very large, which is pretty remarkable when you think about it. (It also holds under much weaker conditions than what I mentioned above; those assumptions are just probably the easiest to understand.)
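To make the "how the randomness is distributed" part concrete, here is a small sketch under the textbook assumptions (i.i.d. draws with finite variance). I picked Exponential(1) draws because they are visibly skewed and have mean 1 and variance 1; the helper names `normalized_sums` and `max_cdf_gap` are made up for this example. It compares the empirical CDF of (X_1 + ... + X_n - n) / sqrt(n) with the standard normal CDF at a few points.

```python
import math
import random
from statistics import NormalDist


def normalized_sums(n, trials=4000, seed=1):
    """Simulate (X_1 + ... + X_n - n) / sqrt(n) for i.i.d. Exponential(1) X_i.

    Exponential(1) has mean 1 and variance 1, so subtracting n centers the
    sum and dividing by sqrt(n) gives it variance 1.
    """
    rng = random.Random(seed)
    return [
        (sum(rng.expovariate(1.0) for _ in range(n)) - n) / math.sqrt(n)
        for _ in range(trials)
    ]


def max_cdf_gap(samples, points=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Largest gap between the empirical CDF and the N(0, 1) CDF at `points`."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    gaps = []
    for x in points:
        empirical = sum(s <= x for s in samples) / len(samples)
        gaps.append(abs(empirical - std_normal.cdf(x)))
    return max(gaps)


if __name__ == "__main__":
    for n in (1, 10, 200):
        print(n, round(max_cdf_gap(normalized_sums(n)), 4))
```

The gap should be large for n = 1, where the distribution is just a shifted exponential, and shrink as n grows, which is what convergence in distribution to a normal looks like in practice.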

Thanks a lot for the explanation.