by Jevon23 1375 days ago
It's not a nitpick. It might be a nitpick if hands were the only thing it couldn't do. But it struggles with a lot more than just hands.

>the tech scales so well as to make their new goalpost irrelevant in a year.

This just brings me back to my original question. Self-driving cars have been "a year away" for many years now, and now companies are starting to hint that human assistance may be required for the foreseeable future [1]. So, why the confidence that art will be an easy problem to solve with just more scaling, when that approach hasn't eliminated the need for humans in any other domain?

[1]https://www.reuters.com/technology/truly-autonomous-cars-may...

I have a suspicion that generative art is going to hit a data wall, also. All of these models are constrained in what patterns they can learn because image captions are not very precise. They can rehash common motifs associated with keywords, but they’re not good at following specific instructions. (“The chair is at the corner of the rug, turned 15 degrees to the left, with the leg nearest the camera aligned with the edge of the fireplace.”) For them to meaningfully improve in this regard, I have to imagine someone will need to locate a trove of a few billion images with exceptionally high quality captions, and well distributed throughout the space of possible image types, subjects, themes, and styles.

I think that details like angle and position will be resolved by using basic sketches as a starting point (we can already make images that sort of conform to layouts as well as prompts), and subdividing the image into assets it then has to stitch together in subsequent steps, and then adjusting lighting/contrast/style as a set of filters in post processing. The wall is lowered quite a bit when you don't insist on doing everything from a single magic prompt.

(This will be great from the point of view of art creation; not so great from the point of view of supposedly rendering humans obsolete.)

That makes sense. I don’t think that will render humans obsolete; I think it will just increase their productivity and ultimately raise the standard of quality expected. It means artists can explore and iterate on ideas faster than if they had to lay down preliminary artifacts manually. But it doesn’t eliminate the need for authorship: someone still needs to decide what to communicate visually and how to communicate it.