Hacker News new | ask | show | jobs
by renonce 838 days ago
I learned the easiest way to do d-dimension sampling in Foundations of Data Science: see https://news.ycombinator.com/item?id=34575637 or https://www.cs.cornell.edu/jeh/book.pdf?file=book.pdf

I don't think it's a good idea to introduce over 20 different methods before talking about the correct one that works for any number of dimensions, say n, and the reason behind its correctness is very obvious:

* Generate a random vector by sampling n standard normal distributions: `vector = np.random.randn(n)`

* Key step: show that the vector has a uniform direction.

The proof is as follows: you look at the probability density function of a normal distribution, which is `p(x)=1/sqrt(2pi)*exp(-x^2/2)`, and the probability density function of the vector is the product of all these densities of its individual dimensions. Now the product `p(x1)p(x2)...p(xn)=1/sqrt(2pi)^n*exp(-(x1^2+x2^2+...+xn^2)/2)` since `exp(a)exp(b)=exp(a+b)` due to power functions.

Now it's easy to see that probability density is invariant to vector length, which means the vector has uniform probability for any specific value of x1^2+x2^2+...+xn^2. Whatever rotation you apply after sampling this vector, since rotation preserves x1^2+x2^2+...+xn^2 by definition, you get the exact same probability density function and therefore the same distribution of vectors.

* Now that the direction of the vector is uniformly sampled, decide the radius separately: for n-sphere the radius is just 1, and for n-ball the volume of the radius is proportional to the nth power of the radius, so you sample a uniform number from [0,1] as the volume and take the nth root as the radius: `radius = np.random.uniform(0, 1)*(1/n)`

* Normalize the vector to the radius: `vector = vector / np.sqrt((vector*2).sum()) * radius`

I think Section 2 of this book provides a much better perspective on the problem of generating uniform random points, since it also provides intuition behind the geometry of high dimensions and properties of the unit ball, etc.

2 comments

I can't believe I didn't already know this
This is the way.