by supermatt 1180 days ago
> Perhaps, but not of Grondo's point.

Grondo said that AI should be given access to all code, including private and unlicensed code.

He was given a link to Clean Room Design, which demonstrates the problem with having the same entity (the AI) read and learn from existing code and then write new code: the risk of regurgitation.

He goes on to say that's what he does, which doesn't change that fact.

> Humans accidentally infringe on copyright sometimes too.

Indeed we do, and it's almost entirely unnoticed, even by the author.

> nor should it or (likely) will it become illegal for machine learning algorithms.

If those machine learning algorithms are taking in unlicensed material and then they later output unlicensed and/or copyrighted material, then they are a liability. Why would you want that when you can train it otherwise and be sure it NEVER infringes others' IP? It's a no-brainer, surely. Or are you assuming there is some magic inherent in other people's private code?

1 comment

> If those machine learning algorithms are taking in unlicensed material and then they later output unlicensed and/or copyrighted material, then they are a liability. Why would you want that when you can train it otherwise and be sure it NEVER infringes others' IP?

Because it could produce a better model that produces better code.

You're now arguing a heavily reduced point. That a model trained on proprietary code is at higher risk of reproducing infringing code is not a point under contention. The clean room serves the same purpose: it is a risk mitigation strategy.

Risk mitigation is a choice, left up to individuals. Maybe you use a clean room design, maybe you don't. Maybe you use a model trained on closed-source IP, maybe you don't. There are risks associated with these choices, but the choices are up to individuals to make.

The choice to observe closed-source IP and learn from it shouldn't be prohibited just because some won't want to assume that risk.