| IANAL but as I understand it, ruling in the US is that machines can not produce "derived works" of copyrighted works. If it replicates (A)GPL code verbatim, it's up to the user to comply with its license. Of course the interesting part is that the user not only has no idea what that license is but also where the code came from and if it is in fact copied verbatim. It's unlikely a court would agree that putting licensed code through a machine strips the licensing requirements of the code, of course, but that doesn't seem to be Microsoft's problem. I think Microsoft's use of public code hosted on GitHub is covered by the terms of service but if this use includes granting a license more permissive than the license indicated on the code itself, this would probably put every GitHub user who ever committed less permissively licensed code to GitHub that they didn't control in violation of those licenses. There's really only three ways this can go: 1) Machine learning does legally become a license-stripping black box, which would allow creating a machine generated commons by feeding arbitrary copyrighted works into sloppy AIs that mostly just replicate their input without changes. 2) Copyright law is extended to consider the output of machine learning as derived works from its inputs, massively extending the reach of copyright and creating massive headaches for everyone (e.g. depending on the exact ruling this would effectively make it impossible to reproduce a digital artwork as merely rendering it on a screen would create a derived work). 3) The original licenses are upheld and remain in effect, rendering the output of Copilot useless by creating a massive legal headache for anyone trying not to violate copyright. I think outcome 2 is unlikely but 1 and 3 aren't mutually exclusive. |