by ergonaught
10 days ago
The GPL, unlike the BSD and such, intends to prevent the closing of distributed derivative works. LLMs trained on GPL code can produce derivative works without any enforcement mechanism. You may be fine with that, but the GPL is not a public domain license, and LLM training treats all things as if they were public domain.

This confuses two completely separate things. The GPL governs the distribution of derivative works. An LLM trained on GPL code does not distribute that code. The model weights are not a copy, a derivative, or a distribution of the training data in any legally recognizable sense; "influenced by" is not "derived from". The enforcement argument is also a non sequitur: the GPL has never had a technical enforcement mechanism. It has always been enforced legally, after the fact, by copyright holders who discover violations. So if an LLM did produce output sufficiently similar to my code, and someone published it in violation of the GPL, I would have the same legal means to enforce my rights as if a human had copied the code.