Hacker News
by throwaway1851 1377 days ago
I suspect generative art is going to hit a data wall as well. All of these models are constrained in what patterns they can learn because image captions are not very precise. They can rehash common motifs associated with keywords, but they’re not good at following specific instructions, such as “The chair is at the corner of the rug, turned 15 degrees to the left, with the leg nearest the camera aligned with the edge of the fireplace.” For them to meaningfully improve in this regard, I have to imagine someone will need to locate a trove of a few billion images with exceptionally high-quality captions, well distributed across the space of possible image types, subjects, themes, and styles.

I think details like angle and position will be resolved by using basic sketches as a starting point (we can already make images that roughly conform to layouts as well as prompts), subdividing the image into assets that the model then stitches together in subsequent steps, and finally adjusting lighting, contrast, and style as a set of filters in post-processing. The wall is lowered quite a bit when you don't insist on doing everything from a single magic prompt.

(This will be great from the point of view of art creation; not so great from the point of view of supposedly rendering humans obsolete)

That makes sense. I don’t think that will render humans obsolete; I think it will just increase their productivity and ultimately raise the standard of quality expected. It means artists can explore and iterate on ideas faster than if they had to lay down preliminary artifacts manually. But it doesn’t eliminate the need for authorship: someone still needs to decide what to communicate visually and how to communicate it.