If the training set contains verbatim (A)GPL code, does this mean that Copilot should also be distributed by Microsoft under the GPL? Because Copilot (as it is distributed by Microsoft) couldn't be built without it, wouldn't that make it a derivative work of GPL'd code (and obviously of code under every other license)? I see a lot of people comparing human learning to machine learning in the comments, but there is a huge difference: we don't distribute copies of humans.
By comparison, Copilot is even more obviously fair use.
I've had this conversation quite a few times lately, and the non-obvious thing for many developers is that fair use is an exception to copyright itself.
A license is a grant of permission (with some terms) to use a copyrighted work.
This snippet from the Linux kernel doesn't make my comment here or the website Hacker News a GPL derivative work:
This snippet from an AGPL-licensed project, Bitwarden, does not compel dang or pg to release the Hacker News source code:
Fair use is an exception to copyright itself. A license cannot remove your right to fair use.
The Free Software Foundation agrees (https://www.gnu.org/licenses/gpl-faq.en.html#GPLFairUse):
> Yes, you do. “Fair use” is use that is allowed without any special permission. Since you don't need the developers' permission for such use, you can do it regardless of what the developers said about it—in the license or elsewhere, whether that license be the GNU GPL or any other free software license.
> Note, however, that there is no world-wide principle of fair use; what kinds of use are considered “fair” varies from country to country.
(And even this verbatim copying from FSF.org for the purpose of education is... Fair use!)