by supermatt 1182 days ago
> I asked if I can learn from that code, which obviously I can.

Did you actually read the link you were given? Clean room design exists because you may inadvertently plagiarize copyrighted works from your memory of reading them.

i.e. the act of reading may cause accidental infringement when implementing the "things you learn"


> i.e. the act of reading may cause accidental infringement when implementing the "things you learn"

Surely you know this isn't the case, right? Maybe you're confused because we're talking about programming and not a different creative art form?

Great artists read, watch and consume copyrighted works of art all day; if they didn't, they wouldn't be great artists. And yet the content they produce is entirely their own, free from the copyright of the works they learned from.

What's the difference then in programming? Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?

Artists get into trouble all the time for producing works very close to something that already exists. That's like the number one reason artists get shunned in the communities they were in.
Every filmmaker watches movies

Every author reads books

Every painter views paintings

Unless you're arguing that every single artist across every field of artistic expression is constantly being jeopardized by claims of copyright infringement, this is a nonsensical point to make.

But they’re not creating similar works, unlike AI, which IS. Why is this so complicated for you?

I would seriously question whether this happens all the time these days. The whole copyright thing is way behind the digital and internet revolution. Look at what the Prince case did for transformative use under copyright fair use.

The process of online artists shaming each other doesn't really have anything to do with the legal system, though they all act like it does.

> Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?

They can't, which is why the quote "Good artists copy, great artists steal" exists.

AI has already been shown to be "accidentally" reproducing copyrighted work. You, too, can do the same.

It's likely no one (including yourself) will ever be aware of it, but strictly speaking it would still be copyright infringement. This is the relevance and context of the link you were given.

If everyone is infringing copyright, no one is infringing copyright. This is a dead-end thought.
Sure, but the infringement is the problem, not the ideas themselves.

You're describing thought crime right now. It's not illegal to learn things.

And if you "learn" something and accidentally rewrite it verbatim? That's what clean-room design is meant to protect against.

Rewriting the code verbatim and distributing it would be a copyright infringement, yes; you do not have a right to distribute code written by other people.

That's completely different from reading and learning from code, which is what grondo described.

Clean room design relies on this: in a clean room design, one party reads and describes the protected work, and another party implements it. That first party reading the protected work is learning from closed-source IP.

> That's completely different from reading and learning from code, which is what grondo described.

AI (e.g. Copilot) has already been shown to infringe the copyright of material in its training set. That's the context of this whole thread.

Perhaps, but not of Grondo's point.

If an AI infringes on copyright then it infringes on copyright; that's unfortunate for the distributors of that code.

Humans accidentally infringe on copyright sometimes too. It's not a problem unique to machine learning. The potential to infringe on copyright has not made observing/learning/watching/reading copyrighted materials prohibited for humans, nor should it or (likely) will it become prohibited for machine learning algorithms.

> Perhaps, but not of Grondo's point.

Grondo said that AI should be given access to all code, including private and unlicensed code.

He was given a link to Clean Room Design demonstrating the problem with the same entity (the AI) reading and learning from the existing code and the risk of regurgitation when writing new code.

He goes on to say that's what he does, which doesn't change that fact.

> Humans accidentally infringe on copyright sometimes too.

Indeed we do, and it's almost entirely unnoticed, even by the author.

> nor should it or (likely) will it become illegal for machine learning algorithms.

If those machine learning algorithms are taking in unlicensed material and then they later output unlicensed and/or copyrighted material, then they are a liability. Why would you want that when you can train it otherwise and be sure it NEVER infringes others' IP? It's a no-brainer, surely. Or are you assuming there is some magic inherent in other people's private code?