Hacker News
by gwern 2789 days ago
I suggested that to deeppomf a while ago. (I was thinking of simply using unlabeled anime images from https://gwern.net/Danbooru2017 with random area deletions, to simplify the model & training process as much as possible.) His belief is that because genitals are such a small fraction of any image, and the rest of each image varies so much while genitals are a fairly narrow domain, a generic inpainting/denoising CNN will learn to inpaint pretty much anything else and neglect genitals specifically.

Presumably if you trained a really big inpainting CNN a lot, it would learn genitals (along with everything else), but it's understandable that he would try a much more targeted approach.
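
To make the "random area deletions" idea concrete, here is a minimal sketch of how one (masked, original) training pair could be produced from an unlabeled image. This is only an illustration under my own assumptions, not deeppomf's pipeline: the function name `random_area_deletion`, the list-of-rows image format, and the 10-30% box size range are all hypothetical.

```python
import random

def random_area_deletion(image, rng, min_frac=0.1, max_frac=0.3, fill=0):
    """Return (masked_image, mask) for one inpainting training pair.

    image: list of rows, each a list of pixel values (hypothetical format).
    mask[y][x] is 1 where the pixel was deleted, 0 elsewhere.
    The size range (min_frac, max_frac) is an assumed default, not a known setting.
    """
    height, width = len(image), len(image[0])
    # Pick a rectangle whose sides are a random fraction of the image sides.
    box_h = max(1, int(height * rng.uniform(min_frac, max_frac)))
    box_w = max(1, int(width * rng.uniform(min_frac, max_frac)))
    top = rng.randint(0, height - box_h)
    left = rng.randint(0, width - box_w)

    # Copy the image so the original stays available as the training target.
    masked = [row[:] for row in image]
    mask = [[0] * width for _ in range(height)]
    for y in range(top, top + box_h):
        for x in range(left, left + box_w):
            masked[y][x] = fill
            mask[y][x] = 1
    return masked, mask

rng = random.Random(0)
# Synthetic 32x32 grayscale gradient standing in for a real image.
original = [[(x + y) % 256 for x in range(32)] for y in range(32)]
masked, mask = random_area_deletion(original, rng)
deleted = sum(map(sum, mask))
print(f"deleted {deleted} of {32 * 32} pixels")
```

The network would then be trained to predict `original` from `masked` (optionally with `mask` as an extra input). Since the box lands uniformly at random, most deleted regions cover hair, clothes, or background, which is exactly the imbalance described above.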

2 comments

So do you know what exactly the model was trained on? Unless I missed it, there's no training code in the repo, or any other indication of how data was prepared.
I'm not sure. I suggested Danbooru2017, as I mentioned, and I thought he was using it, but on double-checking his Reddit comments, he seems to imply he's using only a custom private dataset at this point. Maybe he hand-extracted a lot of censored/original pairs from various places.
A neural net that replaces jarring censorship with suspiciously conveniently placed objects? Hilarious...