|
|
|
|
|
by Yreval
2789 days ago
|
|
This is an incorrect description of what the algorithm is doing. Two networks--a "generator" and a "discriminator"--play a minimax game where the generator maps random vectors into images, and the discriminator attempts to distinguish between these images and real images of celebrities. When the discriminator is (close to) optimal, image-gradients based on the features it uses to predict are passed to the generator, so that the generator can learn the distribution in feature space that makes up human faces. > Images of real people have been scored along the adjustable metrics, and then as you click the +/- adjustments, the source images are very craftily "blended" to produce a finite spectrum of results. There is no reason to believe the source images are actually embedded in the latent space. The descriptors that you can edit in the gui don't necessarily span an orthogonal basis in that latent space, so some of them are correlated, which is why editing one value can change others. Additionally, there is no a priori reason to believe that the manifold of "human face-like images" of 628x1024 is 512-dimensional, so there are areas of the space that still don't map well to real images. The network's ability to cover this space is limited by the number of unique training images it sees, how long it is trained, and its architecture. |
|
I think both you and the author of the article are making the same mistake here. (Although at least you use "orthogonal" and "correlated," whereas the author calls nonorthogonal vectors "entangled" for some reason.)
If you have a nonlinear function f on a vector space, there's no reason why an orthogonal basis for that space will give a better parameterization than a nonorthogonal basis. Even if you have a linear function, there's no reason why that should make a difference.
(For example, take f(x,y) = (x-y,y). Then f(x,0)=(x,0) and f(y,y)=(0,y), so "correlated" input directions (1,0) and (1,1) are mapped to "independent" or orthogonal outputs.)
I think it is a bit of a mystery why Gram-Schmidt orthogonalization makes a difference here. Perhaps the author should experiment more with different inner products.