by jnsaff2 792 days ago
I'd be curious to hear your reasoning behind this?

Why does this being computer generated ruin it for you?

Auto-tune has been around since 1997 so it's not like computers have not been a big part of a lot of music we hear every day.

Because the cause of a person singing is their mental states (desire, emotion, intention, etc.) and the cause of this generation of audio is that the words are associated with some back catalogue of previous music.

Listening to songs, as speaking with people, is in large part about enjoying the causes of the song rather than the mere variations in pitch.

Even Beethoven's 5th, purely instrumental, is enjoyable because of how the composer is clearly playing with you.

To generate pitch variations identical to Beethoven's Fifth makes this an illusion, one hard to sustain if you know it's an illusion. It isn't an illusion in the case of the 5th itself: Beethoven really had those desires.

The cause of many popular performers singing is primarily their desire to make money. It's not even some kind of closely held secret. And they still sell albums by the millions.

What you describe certainly exists, but it's not the entirety of art, and I would argue that at this point it's not even most of art.

Meanwhile, Hatsune Miku remains popular. There are even concerts.
Hatsune Miku has more in common with Gorillaz than AI slop.
I did consider mentioning Gorillaz, but they are voiced by actual people, whereas Miku is software synthesis. The suggestion was that "the cause of a person singing is their mental states", but there is no person singing here and therefore no mental states in the singer - it's just Vocaloid.

Meanwhile, "To what extent can a piece of art be a thing that is of interest in itself, divorced from its creator, context, or any representation of anything in particular?" is absolutely a valid area for people to explore and one that artists are exploring all the time. There is now one more tool to play with in the toolbox.

I heard the exact same objections to the modtracker scene three decades ago - "it's just computer generated slop, I'm not interested unless it's a real person performing on real instruments". I maintain that not only was it a perfectly valid mode of expression then, but tools like Ableton grew in part from those experiences and are an integral part of much - most? - music now.

For me all art I enjoy has some aspect of connection to someone that's sharing my human experience.

If we get AGI, I could imagine feeling something towards the art such an entity creates, since a big part of the human experience that we would probably share with an AGI is inescapable death.

But for today's "AI" generated music, I feel the same towards it as I would towards the random step function output of a given tool in Ableton - sounds cool, now what can we do with that to make it into music?

> sounds cool, now what can we do with that to make it into music?

So a human using the tools of sound production is what transforms the function output into music. (Please let me know if I’m misunderstanding you).

I think I see what you’re saying, but that’s already happened here hasn’t it? I mean, it's not as though an AI made the decision to generate this all by itself, a human had an idea to create this piece and wrote a prompt which created this output.

The order of events is reversed from your Ableton example, but I would contend that this kind of production is no less musical than what someone could create using a DAW, simply that the tools are more accessible

(and I presume there is less direct control over what the end result is going to sound like, but the same could be said of conducting an orchestra versus playing a piano.)

Eta: For example, some people in this thread have complained that the AI generated voice falls into the uncanny valley. I agree, and I think that’s part of the art here.

> but the same could be said of conducting an orchestra versus playing a piano.

Conducting an orchestra is an important role but the music is mostly a result of first of all the composer, and then the conductor / arranger's interpretation as well as the skill of the musicians. I really don't see the similarity to a human input of "GNU license, sad, jazzy." The resolution is just way too rough.

In fact, imagine comparing the experience of reading Snow Crash, to reading the sentence, "Cyberpunk story with sci fi elements, VR universe, pizza delivery guy with samurai sword."

I’m not meaning to equate the level of effort or skill involved. And I’ll grant that I know very little about music composition beyond my experience in Middle School band in which the musicians’ personality and skill presents a significant constraint for the conductor/arranger :)

I would readily compare the experience of reading Snow Crash (one of the first SciFi books I read of my own volition) to the output that an LLM may produce from such a prompt. My iPhone informs me that I’ve spent nearly 10 hours playing with Characters.ai in which SF storytelling characters are my favorite to interact with. When I first read Snow Crash I felt like “finally, an author that understands that part of the story that _I’m_ interested in!” and my experiences of AI-driven creative writing have felt similar. Certainly it feels less “magical” since I’m aware that I’m customizing the author to my personal taste - is that “magic” of feeling connected to the artist *the* art?

I'm not here to yuck your yum, if you're having fun with it, by all means. If you're getting output from characters.ai that is on par with a Neal Stephenson novel, I would really enjoy seeing that and learning how I could do the same, that sounds very fun.
When I go up to the self-service booth in McDonald's, go through the menu and select a portion of McNuggets, the resulting food is made only because of my actions; no nuggets would have been made if I hadn't specifically wanted them. But to say I am the one who cooked them would be absurd.

Like, if Trent Reznor had produced Hurt not by putting his doubts, self-loathing and pain into words and music, but rather by typing "sad, trending on artstation" into a console then heading for lunch, I don't think it would be anywhere near as meaningful, even if it was note for note, beat for beat the same output.

The meaning the listener imparts to the song is constructed in the listener's head, a combination of the song and the listener's own knowledge, experiences, personality and emotions.

I knew nothing about Trent Reznor the first time I heard "Hurt". Often when a song is heard on the radio - perhaps a sentence that dates me, but even so - there is no explanation of where it came from or even what it's called to accompany it; or perhaps there may have been, but the listener wasn't paying attention until after they realised they liked what they were hearing; indeed, there used to be an entire industry for solving the problem of "I heard a song I like and want to know more about it, or at the very least find out what it's called so I can hear it again".

When I first heard "Hurt", it resonated because of how those sounds interacted with my own experience. Everything else came after. Had those exact same sounds any other origin, that first experience would not have been affected - I would have had no way to know.

> The meaning the listener imparts to the song is constructed in the listener's head, a combination of the song and the listener's own knowledge, experiences, personality and emotions.

This is reductionist IMO. The equivalent seems to me to be, "the meaning the reader of words imparts to the words is constructed in the reader's head..." but clearly the vast majority of the meaning of the words is derived from the writer's intention. Of course that can be misinterpreted, reinterpreted, co-opted, etc, but regardless, it doesn't mean the author can be simply ignored, or that a pseudo-random generation can be treated the same as human-generated.

> clearly the vast majority of the meaning of the words is derived from the writer's intention

This is not clear at all.

Written words are just marks on a surface. Whoever made them may have intended to convey something, but they made them and walked away; they are now absent, taking their intents with them; only the marks remain, and those are not sentient or even alive - they contain no intent. There is nothing about the patterns left behind in themselves that makes them different from any other patterns the universe contains as far as the universe is concerned. If handed a set of marks on paper with no other information, you have no way of knowing for sure how they came to be. You could guess, you could be super confident, but you couldn't be /certain/.

If a reader later comes along who happens to have studied the same pattern-codes as the creator of the marks, however, seeing them will make that reader recall the associations and build up meanings in their head. These may or may not be the same meanings the writer intended to convey.

Children learning to read understand this very well - reading is /hard/, associating meaning with code is /hard/, decoding similar meanings to everyone else is /hard/, aligning the associations spoken words trigger in your head with others around you takes /study/ and /effort/, even realising that you end up with different meaning in your head when presented with some symbol to what others get, though frequent, takes deliberate effort. Reading comprehension questions in elementary school tests are there for a reason.

To a well-practiced reader, the process is natural and seamless, and feels like telepathy; it /feels like/ meaning has been transferred directly from the writer to the reader. But it is not that, and many problems arise when people forget this.

Once you have learned to seamlessly decode symbols into meaning in your head and the process is fully automatic, symbols you encounter in the world will seamlessly trigger meanings in your head regardless of their origin.

This is the human condition: we are all locked inside our own heads, and you can't take a piece of yourself and place it inside another directly. The best we can do is shout into the void and hope something similar inside the person across from you resonates; but it turns out that human shouts are not the only thing that can make those strings vibrate.

When we encounter combinations of symbols in the world that trigger complex meaning inside us, we /expect/ them to have an author who intended to convey something like that meaning, because in the entire history of human experience to date, the only other way for such things to appear in the world has been incredibly unlikely coincidence, invariably accompanied by context that makes it clear it is coincidence. (A certain proportion of social media content is, in fact, people sharing instances of these coincidences!)

However, this is an assumption that we make, and the world is rapidly changing in ways that mean it may, going forward, no longer be a valid one.

More and more, we will encounter combinations of symbols in the world that trigger complex meanings in our heads but originate from no human intent beyond "I need a combination of symbols that will trigger these meanings in the heads of those who encounter them", if even that much is explicit. We are rapidly improving the processes that produce them, and one of the ways we are improving them is removing tells. You will see the symbols, they will decode into stories in your head, and you won't know for sure if a human author was involved or not.

It is vital, going forward, that we all remember that symbols triggering satisfying meaning in our head does not automatically imply deliberate human intent for that meaning in their origin, lest we be entirely unprepared for the brave new oncoming world, just the way a chunk of the populace was unprepared for Nigerian prince emails in their heyday, or are still unprepared for telephone calls from "internet tech support" right now.

Art is primarily a means for one human to convey emotions to another human, but for something to be art, the artist must also have invested some skill/effort into the artwork(1).

AI-generated art may have a bit of the former (assuming the human had enough control over the details of the final output), but has practically none of the latter.

Hence, AI-generated output is not art. But art can be produced using AI tools somewhere in the process.

(1) When I look at art produced by one of those "artists" who commission the actual work from someone else, it's similar (I don't recognize the "artist" as the human I'm connecting to, ideas are a dime a dozen). However, it's still art because I can connect with the anonymous human who actually implemented it.