by reorder9695
49 days ago
The whole thing with GPL code seems like a mess and surely couldn't be set as actual precedent, right? It is totally infeasible for me to check every single GPL project on every code hosting platform to see if the code Claude etc. produced is too similar. If the set of training data used for the model were released to check against, that would be one thing, but you can't honestly expect someone to check every repo available from all time to see if a model (when you are not told what it was trained on, and therefore what it could reproduce) might've reproduced code from it. That's not at all like checking the dependency chain of a dependency, as you can just read the licence of anything you're choosing to use. Surely the precedent would have to be that a model trained on GPL code has itself been infected by the GPL, and therefore must have all source/weights released too, if the assumption here is that it can have embedded the code well enough to be able to reproduce it?

I don't see how this follows, unless we also agree that humans who have ever read any GPL code are themselves permanently tainted and therefore cannot produce anything that isn't influenced even slightly by said code.
Is it just because we think the robot does a better job at learning than we do? It's an impossible line to draw, I agree, but I don't agree that the answer is "well then everything must be considered tainted." I say the answer is "ignore a vestigial concern of a bygone era."