Yes, AI should be trained on every piece of information possible. Am I allowed to become a better programmer by looking at private (illegally leaked), closed-source, proprietary code?
One motivation for artists to create and share new work is the expectation that most people won't just outright copy their work, based on the social norm that stealing is dishonorable. This social norm comes with some level of legal protection, but it largely depends on a common expectation of what is considered stealing or not.
Once we have adopted the attitude that we can just copy as we please without attribution, it would be much more difficult to find motivated artists, and we would have failed as a society.
I didn't ask if I can use other people's proprietary closed-source code. Obviously they have the right to that code and how it's used.
I asked if I can learn from that code, which obviously I can. There is no license that says "You cannot learn from this code and take the things you learn to become a better programmer".
That's exactly what I do and it's exactly what AI does.
> I asked if I can learn from that code, which obviously I can.
Did you actually read the link you were given? Clean room design exists because you may inadvertently plagiarize copyrighted works from your memory of reading them.
i.e. the act of reading may cause accidental infringement when implementing the "things you learn"
> i.e. the act of reading may cause accidental infringement when implementing the "things you learn"
Surely you know this isn't the case right? Maybe you're confused because we're talking about programming and not a different creative artform?
Great artists read, watch and consume copyrighted works of art all day. If they didn't, they wouldn't be great artists. And yet the content they produce is entirely their own, free from the copyright of the works they learned from.
What's the difference then in programming? Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?
Artists get into trouble all the time for producing works very close to something that already exists. That's like the number one reason artists get shunned in the communities they were in.
Unless you're arguing that every single artist across every field of artistic expression is constantly being jeopardized by claims of copyright infringement, this is a nonsensical point to make.
I would seriously question whether this happens all the time these days. The whole copyright thing is way behind the digital and internet revolution. Look at what the Prince case did for transformative fair use.
> Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?
They can't, which is why the quote "Good artists copy, great artists steal" exists.
AI has already been shown to be "accidentally" reproducing copyrighted work. You too, can do the same.
It's likely no one (including yourself) will ever be aware of it, but strictly speaking it would still be copyright infringement. This is the relevance and context of the link you were given.
Rewriting the code verbatim and distributing it would be copyright infringement, yes. You do not have a right to distribute code written by other people.
That's completely different from reading and learning from code, which is what grondo described.
Clean room design relies on this, in a clean room design you have one party read and describe the protected work, and another party implement it. That first party reading the protected work is learning from closed-source IP.
If you study a closed source compiler (or whatever) in order to write a competitive product, and the company who wrote the original product sues you for copying it, as the parent suggests, you're on shaky legal ground. Which is why clean room design is a thing.
A clean room design ensures the new code is 100% original, and not a copy of the base code. That is why it is legally preferable, because it is easy to prove certain facts in court.
But fundamentally the problem is copyright, the copying of existing IP, not knowledge. grondo4 is completely correct that there is no legal framework that prevents learning from closed-source IP.
If such a framework existed, clean room design would not work. The initial spec-writers in a clean room design are reading the protected work.
>The initial spec-writers in a clean room design are reading the closed-source work.
Right. And they're only exposing elements presumably not covered by copyright to the developers writing the code. (Of course, this assumes they had legitimate access to the code in the first place.)
Clean room design isn't a requirement in the case of, say, writing a BIOS which may have been when this first came up. But it's a lot easier to defend against a copyright claim when it's documented that the people who wrote the code never saw the original.
Unlike with patents, independent creation isn't a copyright violation.
I don't understand what your point here is. The initial spec-writers learned from the original code. This is not illegal, we seem to be agreed on this point. grondo made the point that learning from code should not be prohibited.
Yes you are allowed to read closed-source, proprietary code and become a better programmer for it.
I've decompiled games to learn how they structure their code to improve the structure of games that I program. I had no right to that code and I used it to become a better programmer, just like AI does.
That's not copyright infringement. You have a right to stop me from using your code, not learning from it.
This is a pretty extreme stance. There is a fine line between "learning from" proprietary code and outright stealing some of the key insights and IP. Sometimes it takes a very difficult conceptual leap to solve some of the more difficult computer science and math problems. "Learning" (aka stealing) someone's solution is very problematic and will get you sued if you are not careful.
If you think that's extreme, wait until you hear my stance that code shouldn't be something that you can own (and can therefore "steal") to begin with.
Now, granted, most EULAs and Terms of Service documents aren't legally enforced, but most software licenses explicitly prohibit decompiling or otherwise disassembling binaries.
So, yes: They have a right to stop you from "learning" from their code. If you want that right, see if they're willing to sell that right to you.
> They have a right to stop you from "learning" from their code.
They absolutely do not, and as pedantic as it may be I think it's very important that you and everyone else in this thread know what their rights are.
If you sign a contract / EULA that says you cannot decompile someone's code, then yes, you are liable for any damages promised in that contract for violating it.
But who says that I ever signed a EULA for the games I decompiled? Who says I didn't find a copy on a hard drive I bought at a yard sale or someone sent me the decompiled binary themselves?
Those people may have violated the contract but I did not.
There is no law preventing you from learning from code, art, film or any other copyrighted media. Nor is there any law (or should there be any law IMO) that stops an AI from learning from copyrighted media.
Learning from each other regardless of intellectual property law is how the human race advances itself. The fact that we've managed to automate human progress is incredible, and it's very good that our laws are the way they are so that we can allow that to happen.