Hacker News
by IgorPartola 1807 days ago
If you view GPL code with your browser would that mean that your browser now has to be GPL as well? In the sense that copilot is not much different than a browser for Stack Overflow with some automation, why would it need to be GPLed? Your own code on the other hand…
5 comments

For the sake of discussion, it would be clearer to split copilot code (not derived from GPL'd works) and the actual weights of the neural network at the heart of copilot (derived from GPL'd works via algorithmic means).

For your browser analogy, that would mean that the "browser" is the copilot code, while the weights would be some data derived from GPL'd works, perhaps a screenshot of the browser showing the code.

I'd think that the weights/screenshot in this analogy would have to abide by the GPL license. In a vacuum, I would not think that the copilot code had to be licensed under GPL, but it might be different in this case since the copilot code is necessary to make use of the weights.

But then again, the weights are sitting on some server, so GPL might not apply anyway. Not sure about AGPL and other licenses though. There is likely some legal incompatibility between licenses in there.

As I understand it, the thing copilot tries to do is automate the loop of “Google your problem, find a Stack Overflow answer, paste the code from there into my editor”. In that sense, the burden of checking the license of the code being copy-pasted is on the person who answered the SO question and on me. If this literally was what copilot did, nobody would bat an eye that some code it produced was GPL or any other license, because it wouldn’t be copilot’s problem.

Now let’s substitute a different database for the code, one that isn’t SO. It doesn’t really matter if that database is a literal RDBMS, a giant git repo, or is encoded as a neural net. All copilot is going to do is perform a search in that database, find a result, and paste it in. The burden of licensing is still on me to not use GPL code, and possibly on the person hosting the database.

The gotcha here is that copilot’s database is a neural network. If you take GPL code and feed it, along with non-GPL code, as training data to a neural network to create what is essentially a lookup table, did you just create a derived work? It is unclear to me whether you did or not. In particular, can the neural network itself be considered “source code”?

> If you view GPL code with your browser would that mean that your browser now has to be GPL as well?

Some good responses in sibling comments already, but I don't see the narrow answer here, which is: No, because no distribution of the browser took place.

If you created a weird version of the browser in which a specific URL is hardcoded to show the GPL'd code instead of the result of an HTTP request, and you then distributed that browser to others, then I believe that yes, you'd have to do so under the GPL. (You might get away with it under fair use if the amount of GPL'd code is small, etc.)

If you use your browser to copy some GPL code into your project, your project must now be GPL as well.

So following your own argument, even if Copilot is allowed, using it still risks you falling under GPL.

My point exactly. Copilot is innocent in that case just like the browser.
Or if you simply read GPL code and learn something from it - or bits of the code are retained verbatim in your memory, are you (as a person) now GPL'd? Obviously not.
That probably depends on how large and how significant the bits you remember are. Otherwise one could take a person with photographic memory and circumvent all GPL licenses easily, by making that person type what they remember.
> Or if you simply read GPL code and learn something from it - or bits of the code are retained verbatim in your memory, are you (as a person) now GPL'd? Obviously not.

I do not find that to be obvious at all.

You do not find it obvious that a human being would not become a GPL'd work?
To build a browser you don't need verbatim GPL code, so it's not a derivative work in the same sense copilot is.

Stack Overflow, on the other hand, is a much trickier question...

SO clearly doesn’t need GPL code to be useful. The wider SE network is evidence of that.