Hacker News
by yorwba 2641 days ago
https://thispersondoesnotexist.com/ only exists because someone already collected enough real photos of real persons to have a neural network learn their distribution well enough to generate new samples.

Harvesting even more photos is not really necessary at this point, and in any case, scraping them off the web would be faster than creating a novelty website.

1 comment

>Harvesting even more photos is not really necessary at this point, and in any case, scraping them off the web would be faster than creating a novelty website.

But scraping them from the web says nothing about the source, even if you manage to remove all stock photos.

This way it is IMHO more likely to be a "real" photo, most probably uploaded by a "real" user, and the site also has the sender's IP address.

Moreover, most photos you can find on the web have had their EXIF information removed by the host, which may not be the case for a photo uploaded directly by a casual user.

As I see it, scraping them off the web is good for quantity but not so much for quality; this (completely hypothetical) approach would give less quantity but IMHO better-quality data.

I definitely tried it with more junk images that I had in stock than real photos, so there's that. There comes a moment when you need to know which face a computer would say most resembles a piece of sushi... So I'm not 100% sure about the quality.