Hacker News
by naet 932 days ago
It's not a completely imaginary problem or a problem only affecting big corporations. If I'm an individual writer or artist and my work gets fed into an LLM against my will it can seriously undercut the value of that work or discourage me from creating more.

If you can just ask the LLM to give you the contents of my book you are less likely to buy it, and if you can just ask the image generator to generate an image in my unique style for free you won't want to buy my artwork.

I think it makes perfect sense that a model needs a specific license to train on my work, especially if the model is run by a massive corporation making a profit off it and if, after downloading a copy of my work and "training" on it, the model can reproduce it verbatim on request.

2 comments

That's not what copyright is for. That's like saying "what if someone reads my book and I don't like them?" That sucks for you, but it's a personal problem unrelated to copyright.
Do you think students should need a specific license to read a book? Do visitors to an art gallery need a specific license to look at paintings? Do audiences need specific licenses to watch a play?

Those people will be influenced by what they've read/seen/heard and their own future writing/drawing/filming/acting/editing/playing might draw inspiration from what they've learned, and they might incorporate things they've learned into their own future work.

Literally every book, song and work of art is "violating copyright" on the thousands of other works that the creator learned from while growing up, if we hold the same standard.

This is a common argument currently, but I think training an LLM is clearly not the same as a student learning. There might be some superficial similarities, but they are fundamentally different on many levels (speed, scale, perfect recall, public access, etc). They are held to different standards because they aren't the same thing.

You can listen to a song on the radio or on an internet stream but not have the rights to record and redistribute it (but you do have the right to listen to it at home with multiple people, etc).

In my opinion, training an LLM is closer to "recording and redistributing" than it is to "taking inspiration" or "human learning".