But legally, they can't provide such a license. So GitHub can't have that license, surely, because the uploader never had the legal authority to bestow it upon GitHub.
Is it that widely scoped? Can't we narrow it to "A third party who finds their GPL code on GitHub but has not uploaded that specific code to GitHub themselves has a right of action limited to that specific code"?
Just because I created a GitHub account once and agreed to the TOS doesn't mean that I agree to let others upload my code to GitHub. Where would that scope end? Could someone steal code off my computer which I've never published and put it on GitHub, and that would be OK because I once signed up for a GitHub account? Clearly a contrived example, but still.
I'm not sure that someone who published their work under the GPL hasn't thereby given recipients the right to put the repo on GitHub. If the rights GitHub asks for in their ToS can be construed as a subset of the rights granted by the GPL, GitHub is just another GPL licensee. Unless they violate the conditions of the license, they're just utilizing their GPL rights.
> GitHub is just another GPL licensee. Unless they violate the conditions of the license, they're just utilizing their GPL rights.
And here is exactly the problem.
GitHub seems to be copying copyrighted code left and right and pretending they made it!
No attribution, no license.
They are of course allowed to let their AI study the code, but as the "employer" of that AI, GitHub/Microsoft bears responsibility if that AI breaks copyright left and right while they as a company pretend the code is theirs to give away.
But the thing to note is that a user can have a right to distribute (as with the GPL) without necessarily having the right to license the code on other terms.
So if the user uploads the source to GitHub, they agree to the terms (granting rights they may not actually hold), but that isn't equivalent to the rights owner giving GitHub the right to distribute the source under a different license.
The TOS can only modify those distribution terms (if it can even be found to be legally binding) if the user uploading the source is the rights owner, which in many cases they are not.
I think the bigger question is whether GitHub will be able to honor DMCA requests that pertain to copyrighted materials showing up in Copilot's suggestions.
Are we entering into a new realm where a DMCA (or DMCA-like) request can be filed to remove content from the training data for an AI (and likely cause it to require retraining)?