Hacker News
by tomtimtall 1731 days ago
This shows much more clearly the problem with all these “this service does not exist” sites: even though it doesn’t show the source material, it’s abundantly clear that these are just simple stitches of the training toons.

Like, you get Elsa with different hair, or the grandpa from Up in a suit.

Once you compare them to the closest examples from the training data, it becomes a lot less impressive than the implied “this face is completely out of the imagination of an AI model”. It turns out the model just imagined someone in the training set with the hair of someone else. Quite boring.
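For anyone who wants to actually run the "compare them to the closest examples from the training data" check, here is a minimal sketch. It assumes images have already been flattened into vectors and uses plain Euclidean distance, which is a crude stand-in for whatever perceptual similarity a real comparison would use. The function name and the random data are hypothetical, only there to make the example run.

```python
import numpy as np

def nearest_training_examples(generated, training):
    """For each generated vector, return the index of and distance to the closest training vector."""
    # Pairwise squared Euclidean distances, shape (n_generated, n_training).
    diffs = generated[:, None, :] - training[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    idx = np.argmin(sq_dists, axis=1)
    dist = np.sqrt(sq_dists[np.arange(len(generated)), idx])
    return idx, dist

rng = np.random.default_rng(0)
training = rng.normal(size=(100, 16))                  # stand-in for flattened training images
copy_like = training[3] + 0.01 * rng.normal(size=16)   # near-duplicate of training item 3
novel = 3.0 * rng.normal(size=16)                      # sample far from the training set
generated = np.stack([copy_like, novel])

idx, dist = nearest_training_examples(generated, training)
print(idx, dist)
```

A generated sample that sits almost on top of a training item gets a tiny distance; one that is genuinely different gets a much larger one. What counts as "too close" is a judgment call this sketch does not settle.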

5 comments

> it’s abundantly clear that these are just simple stitches of the training toons.

That's not how GANs work in general. With a limited training set the end result may appear as if they were stitched together samples, but that's not what happens under the hood.

Interpolation videos make it obvious that the model encodes visual concepts, can freely manipulate them, and can even crank up parameters beyond anything found in the training set, giving exaggerated results.
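Roughly, the latent-space operations behind those interpolation videos look like the sketch below. It only manipulates latent vectors and does not include a generator; the 512-dimensional latent size, the helper names, and using the difference between two random latents as a "direction" are all assumptions for illustration.

```python
import numpy as np

def lerp(z0, z1, t):
    """Linear interpolation between two latent vectors; t outside [0, 1] extrapolates."""
    return (1.0 - t) * z0 + t * z1

def push_along(z, direction, strength):
    """Move a latent vector along a unit-normalised direction by `strength`."""
    unit = direction / np.linalg.norm(direction)
    return z + strength * unit

rng = np.random.default_rng(0)
z_a = rng.normal(size=512)   # latent for one sample (assumed dimensionality)
z_b = rng.normal(size=512)   # latent for another sample

# Frames of an interpolation video: walk from z_a to z_b in equal steps.
frames = [lerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 5)]

# "Cranking up" a parameter: go twice as far along the a-to-b direction as z_b itself.
step = np.linalg.norm(z_b - z_a)
exaggerated = push_along(z_a, z_b - z_a, strength=2.0 * step)
```

Each frame would be fed to the generator to get an image; the `exaggerated` latent lies outside the segment between the two samples, which is where the over-the-top results come from.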

Agreed, but I also agree with OP - as a Disney fan I can pretty easily identify which characters some of the parts were taken from.

One of mine is just Helen from The Incredibles' head with Elsa's face. I couldn’t design a cartoon character from scratch, but I could definitely make a ‘new’ cartoon character if I’m allowed to take Homer Simpson’s head and paste Goofy's face on top.

i think the results shown in this paper contradict your assertion:

https://openaccess.thecvf.com/content_ICCV_2019/papers/Abdal...

given an arbitrary face, we can find its embedding in the latent space of the model. this shows that the model has the potential to generalise to real but unseen examples?
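a toy sketch of what "finding the embedding" means, with loud caveats: the generator here is a hypothetical linear map (image = W @ z), not StyleGAN, and the loss is plain squared error. it only illustrates the idea of optimising a latent to reproduce a given target, and is not the linked paper's method.

```python
import numpy as np

def embed(W, target, steps=500):
    """Find z minimising ||W @ z - target||^2 by plain gradient descent."""
    z = np.zeros(W.shape[1])
    lr = 1.0 / np.linalg.norm(W, ord=2) ** 2    # 1 / (largest singular value)^2
    for _ in range(steps):
        residual = W @ z - target
        z -= lr * (W.T @ residual)              # gradient of 0.5 * ||residual||^2
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8))                    # toy linear "generator" (assumption)
z_true = rng.normal(size=8)

reachable = W @ z_true                          # an image the toy generator can produce
z_hat = embed(W, reachable)                     # recovers z_true up to numerical error

unreachable = rng.normal(size=64)               # arbitrary target, generally off the generator's range
leftover = np.linalg.norm(W @ embed(W, unreachable) - unreachable)
```

for a target the generator can produce, the optimisation recovers its latent; for an arbitrary target some reconstruction error is left over. how small that error is for real unseen faces in a real model is the empirical question.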

on the other hand, i suspect you might be observing a bias in the structuring of the latent space.

thispersondoesnotexist.com likely samples the latent space with a gaussian or uniform distribution, and while the latent space may contain the full spectrum of possibilities, the density of semantically meaningful embeddings may be structured around the distribution of the training set rather than a uniform or gaussian.

i'm stretching my understanding of the topic in trying to convey this.

As others have said, that's not the way a GAN should work. Regurgitating the training set is basically a failure mode that is actively avoided when the models are built and trained.

Looking at these images, and not being familiar with how the underlying CG training set is made, I wonder if the original series itself has some comparatively small set of latent features - dimensions you could adjust when drawing the faces - that the model is just learning, so that newly generated faces are effectively the same thing as if one had changed whatever settings you tweak when working with the underlying tool.

I see what you mean, but this is definitely not universal for StyleGAN; it depends on factors such as the size of the training set (I'm guessing it was smaller here) and the training parameters.

Honestly this seems to be common for GANs in general, though I don't think most people have looked through CelebA. If you are lazier, you can scroll through thispersondoesnotexist and you'll find essentially celebrities with similar characteristics, much like what the OP is describing. More so, you actually see better quality images the closer to a celebrity they look (you see the same thing in the toon version here). I do think ADA is typically worse than the typical StyleGAN2, but that's the tradeoff you get with a smaller sample size (worse because people are training it on smaller datasets, so more memorization).

I believe thispersondoesnotexist is also trained on FFHQ, not just CelebA, though.

How do you know this is the case for all of them?

A good indicator is that when I clicked on the "more..." button, I instantly recognised copied features (eyes, face shape, nose, hair) from The Incredibles and Frozen, just mashed together, in 3 out of the 4 samples.

This shouldn't be the case unless you start actively looking for it.

It's just much easier to recognise with these cartoon characters than with realistic faces as there's naturally much less variety in the training material. Also the features are simplified to a point of being easily recognisable as well.