by hyperbovine 1289 days ago
Presumably for similar reasons that the vast majority of AI generated art and text is off-puttingly hideous or bland. For every stunning example that gets passed around the internet, thousands of others sucked. Generating art that is aesthetically pleasing to humans seems like the Mt. Everest of AI challenges to me.
4 comments

I think your comment is off-topic to the post you are replying to. That wasn't asking about the general aesthetic quality - more about a specific audio artifact.

> For every stunning example that gets passed around the internet, thousands of others sucked.

From personal experience this is simply untrue. I don't want to debate it because you seem to have strong feelings about the topic.

Even if you remove the artifact, the exact same comment applies. It generates a somewhat less interesting version of elevator music. This is not to crap on what they did. As I said, the underlying problem is extremely difficult and nobody has managed to solve it.

I don't feel strongly about this topic at all.

> It generates a somewhat less interesting version of elevator music.

This iteration does, but that's an artifact of how it's being generated: small spectrograms that mutate without emotional direction (by which I mean we expect things like chord changes and intervals in melodies that we associate with emotional expressions - elevator music also stays in the neutral zone by design).

I expect with some further work, someone could add a layer on top of this that could translate emotional expressions into harmonic and melodic direction for the spectrogram generator. But maybe that would also require more training to get the spectrogram generator to reliably produce results that followed those directions?

The vast majority of human generated art is hideous or bland. Artists throw away bad ideas or sketches that didn’t work all the time. Plus you should see most of the stuff that gets pasted up on the walls at an average middle school.
Hard disagree. The average middle school picture will have certain aspects exaggerated, giving you insights into the mind's eye of the creator, how they see the world, what details they focus on. There is no such mind's eye behind AI art, so it's incredibly boring and mundane, no matter how good a filter you apply on top of its fundamental lack of soul or anything interesting to observe in the picture beyond surface level. It's great for making art for assets for businesses to use, it's almost a perfect match, as they are looking to have no controversial soul to the assets they use, but lots of pretty bubblegum polish.
Perhaps most of the AI art out there (that honestly represents itself as such) is boring and mundane, but after many hours exploring latent space, I assure you that diffusion models can be wielded with creativity and vision.

Prompting is an art and a science in its own right, not to speak of all the ways these tools can be strung together.

In any case, everything is a remix.

I have to agree, the act of coming up with a prompt is one and the same as providing "insights into the mind's eye of the creator, how they see the world, what details they focus on" - two people will describe the same scene with completely different prompts.
And the vast majority of professionally produced artwork is for business use. It’s packaging design or illustration or corporate graphics or logos or whatever.

I don’t get the objection.

> For every stunning example that gets passed around the internet, thousands of others sucked

…implying there may be an art to AI art. Hmm.

Meanwhile, the degree to which it is off-puttingly hideous in general can be gauged by the popularity of Midjourney: millions of folks (of perhaps dubious aesthetic taste) find the results quite pleasing.

Not sure about this. Models like Midjourney seem to put out very consistently good images.