by CapsAdmin
1142 days ago
Your polaroid example would require someone to write code that does that one specific thing. You could also argue that it would violate copyright if it were trained on one photographer's unique style, built into an app, and marketed as being able to mimic that photographer's style. But in your example you have 1000 random polaroid images of unknown origin, so it becomes abstract enough that it isn't an issue.
In your Stephen King example I would say it's still learned, because the "code" is a general language model that can learn anything. You just decided to train it only on Stephen King novels. If you have an image model trained 100% on public domain images and fine-tune it to replicate a specific artist's style, I personally think the fine-tuned model and its creator may be violating copyright.
But when it comes to learning, I would say that when you write a program whose purpose is to learn to predict the next word or pixel, and it's up to the computer to figure out how to do that, the computer is learning when you feed it input data. It's the program's job to figure out the best way to predict, not the programmer's. (It's not that black and white, given that the programmer will also sometimes guide the program, but you get the idea.) When you write a program that does one or several specific things, it's not learning. I think it has something to do with the difference between emergent behavior from simple rules and intentional behavior from complex rules.
If I created a program to read words from the input and assign weights based on previous words, I could feed in any data. Just like the polaroid example. (I suggested that the polaroid example was abstract enough not to be an ethical/legal problem because I believe it is mostly transformative, unless the colours themselves were copyrighted or a distinct enough work in themselves.)
Now if I only feed in Stephen King books and let it run, suddenly it outputs phrases, wording, place names, character names, and adjectives all from Stephen King's repertoire. Is this a 'general language model'? Should this be exempt from copyright? I don't think this is transformative enough at all. I've just mangled copyrighted works together, probably not enough to stand up against a copyright claim.
I think people use AI and ML as buzzwords to try and obfuscate what's actually happening. If we were talking about AI and ML that doesn't need training on any licensed or copyrighted work (including 'public domain') then we can have a different conversation, but at the moment it's obscured copyright theft.
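For concreteness, the kind of program described above (read words from the input and weight each one by the word before it) can be sketched as a simple bigram model. This is only a minimal illustration, not anyone's actual system: the function names `train_bigram` and `generate` are hypothetical, and the one-line corpus is a made-up placeholder rather than text from any real author.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word follows each previous word."""
    weights = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        weights[prev][nxt] += 1
    return weights


def generate(weights, start, length, seed=0):
    """Emit up to `length` words, sampling each from the learned counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = weights.get(out[-1])
        if not followers:
            break  # the last word never had a successor in the training text
        candidates = list(followers)
        counts = [followers[w] for w in candidates]
        out.append(rng.choices(candidates, weights=counts, k=1)[0])
    return out


# Placeholder corpus (an assumption for illustration only).
corpus = "the dark tower stood at the end of the dark road"
model = train_bigram(corpus)
sample = generate(model, "the", 8)
print(" ".join(sample))
```

The code is the same whatever text is fed in, yet every word and every word pair it emits comes straight from the training text, which is the tension both comments are pointing at.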