by smallnamespace 3147 days ago
Isn't the unsuitability of the high-dimensional Gaussian intimately related to the fact that, for most realistic problem spaces, we believe the true dimensionality is far lower than the N >> 1 measured dimensions?

A uniform Gaussian presupposes that the variates are either uncorrelated (linearly orthogonal), or all have the same linear interaction with each other (a single fixed positive correlation).

If your actual problem has dimension 20 but you've measured it with N dimensions, then there must be strong interactions between your measured variates. Moreover, those interactions do not share a single fixed strength (like a single Gaussian correlation), but probably vary like a random matrix.
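
To make that concrete, here is a rough toy sketch, not anything rigorous. It assumes the 20 "real" dimensions are independent standard normals, that each of N = 200 measured variates is a random linear mix of them plus a little noise, and that numpy is available. The function name, the Gaussian mixing matrix, and the noise level are all my own illustrative choices.

```python
import numpy as np

def low_dim_correlations(n_samples=2000, k=20, n_measured=200, noise=0.01, seed=0):
    """Measure a k-dimensional latent problem with n_measured variates."""
    rng = np.random.default_rng(seed)
    # Latent "true" coordinates: k independent standard normals per sample.
    latent = rng.standard_normal((n_samples, k))
    # Hypothetical measurement process: each measured variate is a random
    # linear combination of the latent coordinates, plus small noise.
    mixing = rng.standard_normal((k, n_measured))
    data = latent @ mixing + noise * rng.standard_normal((n_samples, n_measured))
    # Pairwise correlations between measured variates (columns).
    corr = np.corrcoef(data, rowvar=False)
    off_diag = corr[np.triu_indices(n_measured, k=1)]
    # Eigenvalues of the correlation matrix, largest first.
    eigvals = np.linalg.eigvalsh(corr)[::-1]
    return off_diag, eigvals

off_diag, eigvals = low_dim_correlations()
print("spread (std) of pairwise correlations:", off_diag.std())
print("share of variance in top 20 eigenvalues:", eigvals[:20].sum() / eigvals.sum())
```

In this setup the pairwise correlations are spread over a wide range rather than sitting at one fixed value, and nearly all of the variance lands in the top 20 eigenvalues, which is the "far fewer real dimensions" picture.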

This might be related to the Tracy-Widom[1] distribution somehow. Perhaps the distribution you use to replace the Gaussian should really be something like: first generate a random positive semi-definite matrix C to serve as the covariance, then generate random data from it, repeating over different random choices of C.
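
A minimal sketch of that two-stage sampling, under assumptions that are mine and not pinned down above: C is built Wishart-style as F F^T from a Gaussian factor F (which guarantees positive semi-definiteness, with rank 20 here), and the data given C are zero-mean Gaussian. The names `random_psd` and `sample_mixture` are hypothetical, and other ways of drawing C would be just as defensible.

```python
import numpy as np

def random_psd(n, rank, rng):
    """Draw a random n x n PSD matrix C = F F^T of the given rank."""
    factor = rng.standard_normal((n, rank)) / np.sqrt(rank)
    return factor @ factor.T, factor

def sample_mixture(n_draws=5, samples_per_draw=100, n=50, rank=20, seed=0):
    """Stack Gaussian samples drawn under several random covariances C."""
    rng = np.random.default_rng(seed)
    batches = []
    for _ in range(n_draws):
        # Stage 1: a fresh random covariance C (kept in factored form F).
        cov, factor = random_psd(n, rank, rng)
        # Stage 2: x = F z with z ~ N(0, I) has covariance F F^T = C.
        z = rng.standard_normal((samples_per_draw, rank))
        batches.append(z @ factor.T)
    return np.vstack(batches)

data = sample_mixture()
print(data.shape)
```

Sampling through the factor F avoids handing a singular C to a multivariate normal routine, and the pooled data is no longer a single Gaussian but a mixture over random covariances.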

[1] https://en.wikipedia.org/wiki/Tracy%E2%80%93Widom_distributi...