by imdsm 1807 days ago
Some feedback from someone who now has RSI from clearing the screen:

1. Clear to background button

2. Brush sizes

3. Don't regenerate while the mouse is down, wait for mouse up


1. Pushing now as an option to the vertical [...] menu (give it 10 - 20 mins)

2. Long press on the brush icon for brush menu. Can set brush size numerically or use [-] [+], also can use keys "[" and "]". We will put preset sizes on the list.

3. Does currently generate on mouse up, but pushing fix to wait 3s before generation.
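
For the curious, the "wait 3s before generation" behaviour in point 3 is commonly done with a debounce: every mouse-up restarts a timer, and generation only fires once the timer runs out with no new strokes. The thread doesn't say how the fix is actually implemented, so this is only a rough sketch under that assumption; the `Debouncer` class and its method names are made up for illustration, and timestamps are passed in by hand to keep it self-contained.

```python
class Debouncer:
    """Fire only once no new event has arrived for `delay` seconds.

    Hypothetical helper for illustration; not the app's actual code.
    """

    def __init__(self, delay):
        self.delay = delay
        self.last_event = None  # time of the most recent mouse-up, if any

    def on_event(self, now):
        """Record a mouse-up at time `now` (seconds); this restarts the wait."""
        self.last_event = now

    def should_fire(self, now):
        """Return True exactly once, when the quiet period has elapsed."""
        if self.last_event is None:
            return False
        if now - self.last_event >= self.delay:
            self.last_event = None  # consume the pending event
            return True
        return False


# Two strokes in quick succession, then a pause.
d = Debouncer(delay=3.0)
d.on_event(0.0)                   # first mouse-up
fired_early = d.should_fire(1.0)  # only 1s of quiet so far
d.on_event(2.0)                   # second mouse-up restarts the wait
fired_mid = d.should_fire(4.0)    # only 2s since the last stroke
fired_late = d.should_fire(5.0)   # 3s of quiet: generate now
fired_again = d.should_fire(6.0)  # nothing pending any more
```

A real UI would poll `should_fire` from its event loop or use a timer callback instead.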
Awesome. I'll get some lunch and try again. I'd really love to read about /how/ this works if that's something you'd be interested in sharing.
Yeah for sure. I'll write up a blog post, but here is my high-level overview for now. Ollie (ML engineer) from our team will fill in the technical details.

This works by using a “segmentation map”. Deep learning models are really good at per-pixel labelling.

You’ve probably seen the Remove Background tools that classify each part of an image as either background or foreground to make an alpha mask. But we don’t have to stop at two classes. We can classify an image in much more detail, such as which parts contain an eye, ear, nose, mouth, etc. Passing an image of a face to an ML classification model can return a so-called “segmentation map”, which assigns a unique color to each class of facial feature. This kind of per-pixel classification, known as semantic segmentation, is commonly used in computer vision for manufacturing assembly lines, but we want to use it for art...
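
To make the "unique color per class" idea concrete, here is a toy sketch in plain Python. The class list, the colors and the 2x2 grid of scores are all invented for illustration (the thread doesn't give the real model's classes or palette); a real model would output scores for every pixel of a full image.

```python
# Hypothetical classes and palette; not taken from the actual model.
CLASSES = ["background", "skin", "eye", "nose", "mouth"]
PALETTE = {
    0: (0, 0, 0),        # background
    1: (255, 224, 189),  # skin
    2: (0, 0, 255),      # eye
    3: (0, 255, 0),      # nose
    4: (255, 0, 0),      # mouth
}


def argmax(scores):
    """Index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def to_label_map(class_scores):
    """class_scores[y][x] holds one score per class for that pixel.

    Returns label_map[y][x], the id of the winning class.
    """
    return [[argmax(pixel) for pixel in row] for row in class_scores]


def to_segmentation_map(label_map):
    """Replace each class id with its RGB color."""
    return [[PALETTE[label] for label in row] for row in label_map]


# Made-up per-pixel scores for a 2x2 "image".
scores = [
    [[0.9, 0.1, 0.0, 0.0, 0.0], [0.1, 0.7, 0.2, 0.0, 0.0]],
    [[0.0, 0.2, 0.8, 0.0, 0.0], [0.0, 0.1, 0.0, 0.1, 0.8]],
]
labels = to_label_map(scores)
seg_map = to_segmentation_map(labels)
```

With only two classes (background and foreground) the same steps would give the alpha-mask case mentioned above.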

Now this is where it gets interesting: we can play this in reverse. We give the ML model a segmentation map and use a GAN to generate the most convincing image of a face that it can (given the dataset it was trained on).
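
As a rough illustration of the "reverse" direction, a painted segmentation map is typically turned back into class ids and then into one channel per class (one-hot) before being handed to a generator. Whether this app encodes its input exactly this way isn't stated in the thread, so treat the palette, the function names and the encoding as assumptions; the trained generator itself is left out.

```python
# Hypothetical color-to-class table; the real palette is not given in the thread.
COLOR_TO_CLASS = {
    (0, 0, 0): 0,        # background
    (255, 224, 189): 1,  # skin
    (0, 0, 255): 2,      # eye
}
NUM_CLASSES = len(COLOR_TO_CLASS)


def colors_to_labels(seg_map):
    """Map each painted RGB pixel back to its class id."""
    return [[COLOR_TO_CLASS[color] for color in row] for row in seg_map]


def one_hot(label_map, num_classes):
    """Return channels[c][y][x] = 1.0 where pixel (y, x) has class c, else 0.0."""
    return [
        [[1.0 if label == c else 0.0 for label in row] for row in label_map]
        for c in range(num_classes)
    ]


# A made-up 2x2 drawing: background, skin / eye, skin.
drawn = [
    [(0, 0, 0), (255, 224, 189)],
    [(0, 0, 255), (255, 224, 189)],
]
drawn_labels = colors_to_labels(drawn)
channels = one_hot(drawn_labels, NUM_CLASSES)
```

A trained generator would take `channels` as its conditioning input and output an image in which each region looks like its class.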

The really cool thing is that this isn’t limited to just faces: you can do this with landscapes, buildings, cars, cats, you name it. Anything where you can train a model by classifying the segments or "features" of an image.