AI image generators readily take partial images and run in a super-resolution mode (they are always in that mode). They can take a stick figure, a screenshot, or anything else you want. They prefer to have a text description of the image, but that can be generated too if needed.

It seems Western AI platform companies generally prefer architectures based on purely literal descriptions over ones that take multimodal, non-literal inputs to follow users' intents closely. The first work in that direction came from Chinese researchers. There appears to be a psychological resistance to non-literal forms of thought among Western entities, as if there were some theory of literal-text superiority deep down in people's minds. Others, like researchers at Chinese labs, probably don't have that.

Setting aside the obvious reactions, artists' responses to generative fill-ins are lukewarm at best. AIs tend to treat an artist's intentions as deviations from the mean and steer the image in less interesting, noisier directions. That negates the potential productivity gains.

I don't think there's any AI trained to generate ideal strokes from prompts in order to teach someone, or any dataset that could be used for it, especially in the current climate around AI image generation. The bridges between AI and artists of many kinds are burning white hot, and nothing is getting across.