by Natsu 1016 days ago
My problem with this is that artists learn by studying other artists. Cutting that off because it's AI, rather than focusing on whether the resulting work is derivative, seems like more of a problem to me. It seems to me that an AI can be used for either original work or derivatives. Proving that you can get derivatives out of it has always struck me as no different from commissioning a copy of someone's work from a human artist and being shocked that you got what you asked for.

Can an AI express to you how Van Gogh affected it as an artist? I'm not sure that AI is "learning" the way we say humans are "learning" when humans learn and study art. Obviously there is no debate that you can input Van Gogh into a model and produce something Van Gogh-like as a result. But I've not seen anything that indicates that the AI is learning anything about Van Gogh at all. Perhaps it comes down to whether you think learning Van Gogh is just creating a mapping of all of his brush strokes ever, and only exactly what they look like. It's obvious the AI knows nothing more than that. If you think that's what humans do when they learn art, I'd be sad for you!

As to your hypothetical, we don't give copyrights to people who make rote copies of things, human or otherwise. Is the implication of the shock that there is sufficient difference in the work to render it a derivative and not a copy? Okay, how so? And of what consequence? Making derivatives of a copyrighted work without a license is infringement.

I think it's learning styles in a way that's at least partially analogous, because it comes out with things that are reasonably original and not in the training data.

I'm sure an LLM can write you an essay like that for any artist you want, but I'm not all that convinced those are meaningful even with humans.

> As to your hypothetical

That's the thing, it's not a hypothetical, it's a past story from here on HN. Someone did that, asking for copies of a famous painting (Girl with a Pearl Earring) and got highly derivative items out of the model and we had a debate over whether that even means anything, because that's both a simple description of the painting and the name of a famous work, so it makes it so it can be ambiguous whether you asked for "Girl with a Pearl Earring" or a girl with a pearl earring in the prompting.

I agree that it looks like copyright infringement whether it's done by a human or AI, though. I guess a lot of people missed the prior discussion on HN.

>I think it's learning styles in a way that's at least partially analogous, because it comes out with things that are reasonably original and not in the training data.

I don't think that is evidence that what it is doing is "learning".

>I'm sure an LLM can write you an essay like that for any artist you want, but I'm not all that convinced those are meaningful even with humans.

Well, it wouldn't be reflective of what the LLM thinks, so what is your point? If you are of the belief that humans don't have thoughts, I guess it's not a surprise you view things this way.

>That's the thing, it's not a hypothetical, it's a past story from here on HN. Someone did that, asking for copies of a famous painting (Girl with a Pearl Earring) and got highly derivative items out of the model and we had a debate over whether that even means anything, because that's both a simple description of the painting and the name of a famous work, so it makes it so it can be ambiguous whether you asked for "Girl with a Pearl Earring" or a girl with a pearl earring in the prompting.

You say derivative but without any reference to what it actually means... what about it is derivative? That's the analysis that's happening in court. The analysis isn't "what you asked the LLM," because that's not dispositive of whether or not something is a copy.

>I agree that it looks like copyright infringement whether it's done by a human or AI, though. I guess a lot of people missed the prior discussion on HN.

Sorry I don't read every single thread about copyright on HN? This is the second posting I've seen on the RFC today. Give me a break!

> I don't think that is evidence that what it is doing is "learning".

When I say learning I mean something like "gaining new ability by studying how others did the same task, resulting in being able to produce novel output." I'm not quite sure what you are using the word to mean here, though I might agree that there are differences between what AIs do and what humans do, the question being what they are and whether they're important here.

I don't claim to know anything about the internal experience (if any) of an LLM writing such an essay and I can't really reason about that because I've never been an LLM, whereas I can at least relate to human experience. I think your assertion that it "wouldn't be reflective of what the LLM thinks" is a bit like saying that you don't think submarines are actually "swimming," as the saying goes, though. It may not "think" in human terms as we do, but it's certainly doing some kind of calculation that produces an equivalent output, so I have a lot of questions about whether we can say that on principle. We're well past passing the Turing test for a lot of things, either the original or censored form, these questions are getting less academic by the day.

> You say derivative but without any reference to what it actually means

We're talking about copyright law, so the meaning of derivative was borrowed from that, i.e. that AI model was producing works that could be reasonably thought to have infringed on the copyright of that painting when prompted for "a girl with a pearl earring" and this was held up to mean that AIs are just regurgitating training data and are therefore implicitly missing something essential to being an artist or what have you and all their work should be considered derivative works of the training data as far as copyright law is concerned.

Meanwhile, I'm saying that I think the AI should be judged about like a human artist would be to argue against the people who seem to want to say that the AI can't take input from copyrighted things without all of its output being tainted forever. We have no such requirement for humans and I don't see why it makes sense to add this new restriction on AIs specifically.

> Sorry I don't read every single thread about copyright on HN?

I'm not faulting you for not knowing, I'm faulting myself for assuming too much context and just trying to explain what I had in my head when writing that so you could understand how I came to think that. Hopefully this lets you see where I'm coming from.

>When I say learning I mean something like "gaining new ability by studying how others did the same task, resulting in being able to produce novel output." I'm not quite sure what you are using the word to mean here, though I might agree that there are differences between what AIs do and what humans do, the question being what they are and whether they're important here.

I think the dictionary definition is more than sufficient: "the acquisition of knowledge or skills through experience, study, or by being taught." This is what I mean by running with your own made up definition.

>I don't claim to know anything about the internal experience (if any) of an LLM writing such an essay and I can't really reason about that because I've never been an LLM, whereas I can at least relate to human experience. I think your assertion that it "wouldn't be reflective of what the LLM thinks" is a bit like saying that you don't think submarines are actually "swimming," as the saying goes, though. It may not "think" in human terms as we do, but it's certainly doing some kind of calculation that produces an equivalent output, so I have a lot of questions about whether we can say that on principle. We're well past passing the Turing test for a lot of things, either the original or censored form, these questions are getting less academic by the day.

You are the one redefining words like "think" and "experience," not me. I'm not playing that game at all. After all, you are the one equating these processes between humans and AI by coming up with your own, much broader concoctions.

>We're talking about copyright law, so the meaning of derivative was borrowed from that, i.e. that AI model was producing works that could be reasonably thought to have infringed on the copyright of that painting when prompted for "a girl with a pearl earring" and this was held up to mean that AIs are just regurgitating training data and are therefore implicitly missing something essential to being an artist or what have you and all their work should be considered derivative works of the training data as far as copyright law is concerned.

I'm familiar with copyright law; I'm not sure you are. A work can be derivative in a number of ways: some are legal, some aren't. It's not a new thing that some uses by a machine can be infringing, and others, non-infringing. Why now must it be that machines should be analyzed the same as humans all of a sudden?

>Meanwhile, I'm saying that I think the AI should be judged about like a human artist would be to argue against the people who seem to want to say that the AI can't take input from copyrighted things without all of its output being tainted forever. We have no such requirement for humans and I don't see why it makes sense to add this new restriction on AIs specifically.

Yes, I understand that. But I asked why it should be judged as a human, and you are saying because it "learns". But that's only based upon your re-defining the concept of learning in order to make it inhuman. The only reasonable arguments I've seen that AI outputs should be copyrightable are based on them being a tool that an artist can use. What you are saying is just dressed up anthropomorphization.

> I think the dictionary definition is more than sufficient: "the acquisition of knowledge or skills through experience, study, or by being taught." This is what I mean by running with your own made up definition.

I mean, if a human looked at a bunch of art, essays, etc. and then was able to produce similar works, we'd normally consider that "learning." What word would you use for being able to reproduce Picasso (or whomever) by looking at a bunch of examples?

Also, I don't think I have defined "think" or "experience" at all. But I'd point out that I don't see anything like a principled boundary around them, or that we can point to something that humans do that AIs don't or can't do. It seems to fall back on something that looks like qualia or subjective internal experience, and philosophy hasn't resolved that with respect to other humans... except by analogy. "I think the other humans are like me and I have subjective internal experience, so they probably have it too, rather than being p-zombies."

If you have a better answer to that, feel free to tell me, it'd be interesting.

> It's not a new thing that some uses by a machine can be infringing, and others, non-infringing. Why now must it be that machines should be analyzed the same as humans all of a sudden?

Sure, I'll agree that it's not even necessary to consider the works transformative or whatever.

FWIW, I don't think that AIs should be getting their own copyrights or anything like that, I'm just saying that the training data shouldn't forever taint the output no matter what's produced.

You can ask someone to produce a pin-up version of Minnie Mouse, but good luck using it in any commercial activities.

Most LLMs are just profiteering from people’s labor without their consent. And there’s nothing new being produced. It’s always a statistical output of previous works.

> You can ask someone to produce a pin-up version of Minnie Mouse, but good luck using it in any commercial activities.

The same would automatically apply to LLM output -- there's no need to change the current laws to cover that case.

The question is this. Suppose I ask a human artist and an LLM to create me a new female mouse cartoon character. And suppose both the artist and the LLM have been exposed to Minnie Mouse. It's not unlikely that the new character created in both cases will have aspects specifically similar to, or specifically opposite to Minnie Mouse.

In the case of the human artist, the new character will not be covered by Disney's copyright, unless there was a lot of copying. Why should the result be different for LLMs?

The logical conclusion of "any output of an LLM that's seen Minnie Mouse must be subject to Disney's copyright" is "any output of any human that's seen Minnie Mouse must be owned by Disney". Which I'm sure Disney would love, but would certainly make the world a worse place for everyone.

> a pin-up version of Minnie Mouse

That's not because of copyright, but because of trademark. If you make the character sufficiently different that the average person cannot mistake it for Minnie, and don't call it Minnie Mouse (to get rid of the trademark issue), Disney will have a much harder time suing you. Of course, they will still try, and steamroll you with just money instead.

> And there’s nothing new being produced. It’s always a statistical output of previous works.

I don't think you can define those terms such that what you say is true of AI but not true of people.

I think you're misunderstanding that. I don't expect it in either case; I'm saying you have to judge the output, not the input. So even if it trained on a ton of copyrighted artwork, if the output isn't a ripoff of something in the training data, I don't think there should be any copyright issues.