by 1dry 1410 days ago
Every one of the MidJourney images featured in this article is obviously derived from averaging. The model is trained to correlate the average look of this or that with whatever keywords. For example, the "artist at work" – okay from afar with no real examination/consideration, it looks like something, kind of. As soon as you actually begin to look at it, it becomes glaringly uninspired. Hmm, what's he working on there... is that a hand? What even is going on in front of the artist? How about the front-and-center desk – nothing is actually definitive, I can't recognize a single thing on that desk! Etc., etc., and similar observations can be made about all of the other images.

Not that art should necessarily be a clear depiction of real things, far from that. Good art (read: art I feel a connection with and inspired by) is the presentation of a SUBJECTIVE POINT OF VIEW. In these images, all I see is the averaging of many points of view (the actual artworks the model was trained on, I presume), resulting in no point of view at all.

This technology will take jobs like designing 3-star hotel lobby wallpaper, not the places of human artists. Without a human point of view, "art" like this is flavorless jello.

2 comments

Your concerns regarding averaging and the lack of specifics are both addressed in the article.

In short, it's a mistake to judge AI art by its current capability. It's about to go on an exponential tour.

Well, the author says they keep up on "the latest papers" and whatnot, which leaves me as yet unconvinced of the artistic quality of the future output from these models. We will see.

While I'm still rambling on about this – the author states: “In a strictly mechanical sense, yes, an artist can still create, knowing that the art could be done faster and better by AI.” The art cannot necessarily be done better by an AI, and the speed with which art is created does not matter, artistically.

> In short, it's a mistake to judge AI art by its current capability. It's about to go on an exponential tour.

It's been going on one for the last five years. Two years ago it would have been difficult to generate a picture that's recognizably anything at all, using any non-specific engine.

Now we're down to complaints that mostly don't matter that much. Two years from now..?

So, what matters?

> It's about to go on an exponential tour.

The Singularity is ever near because exponential growth!

As with vehicular autonomy? We all heard predictions like that about 4-5 years ago.

What you're describing is a known limitation of Midjourney and its default impressionist style. The current solution to that problem: any particular item on the desk can be erased and re-generated with DALL-E, with prompt control, which will add a much clearer item.

That's an interesting idea, dogcomplex. I'd be very curious to see the output of that process. Still ... who or what will choose what is too "average-y," and what should be replaced with something clearer? Will some larger-scope model be responsible for that? Is there then any way to stop that recursion in an artistic way, or does a human artist ultimately need to get involved?