Hacker News
by itissid 4 hours ago
Noob question: There is a famous parlor trick with generative networks (I think it was GANs, but it might have been some kind of diffusion-based network): you start with a canvas and draw a stick figure of what you want, and the generative network draws the rest of it.

Do AI platform companies actually pre-train networks to do the same for artists who draw by hand?

Related question: If they do train them to do that, are there any that train people for the "reverse": learning how to draw with paper and pencil by showing techniques only, i.e. only the "what" but not the "how"?

3 comments

AI image generators readily take partial images and run in a super-resolution mode (they are always in that mode). They can take a stick figure, a screenshot, or anything else you want. They prefer to have a text description of the image, but that can be generated too if needed.

It seems Western AI platform companies generally prefer architectures based on purely literal descriptions over ones that take multimodal, non-literal inputs to closely follow users' intent. It was some Chinese guys who first did work in that direction. There appears to be psychological resistance to the idea of non-literal forms of thought among Western entities, as if there's some literal-text superiority theory deep down in people's minds. Others, like researchers from Chinese labs, probably don't have that.

Artists' responses to generative fill-ins are lukewarm at best, even with the obvious responses put aside. AIs tend to treat an artist's intentions as deviations from the mean and steer the image in less interesting, noisier directions. That negates the potential productivity gains.

I don't think there's any AI trained to generate ideal strokes from prompts so as to teach someone, or any dataset that could be used for it, especially given the current climate around AI image generation. The bridges between AI and artists of many kinds are burning white hot; nothing is getting across them.

re: parlor trick

Are you referring to ControlNet (https://github.com/lllyasviel/controlnet)?

They're likely talking about pix2pix-style image-to-image translation from about a decade ago: https://arxiv.org/abs/1611.07004

Heh, that user's handle is a reference to the visual novel Fate/Stay Night (which has been adapted into several anime). A cute coincidence.