| "The problem is not that Copilot produces code that is "inspired" by GPL code, it's that it spits out GPL code verbatim." But only snippets, as far as I can tell. This is the code example linked by the author: https://web.archive.org/web/20221017081115/https://nitter.ne... It is still not trivial code, but are there really lots of different ways to transpose a matrix? (Also, the input was "sparse matrix transpose, cs_", so his naming convention was specifically included. It is questionable whether a user would get his code in this shape with a normal prompt.) And since slightly changing the code seems trivial, at what point would it become acceptable? I just don't think spending much energy there is really beneficial for anyone. I would rather look at the potential benefits of AI for open source. I haven't used Copilot, but ChatGPT4 is really helpful at generating small chunks of code for me, enabling me to aim higher in my goals. So what's the big harm if some proprietary black box also gets improved, when all the open source devs can also produce with greater efficiency? |
This. People seem to forget that generative AIs don't just spit out copyrighted work at random, of their own accord. You have to prompt them. And if you prompt them in a way that strongly hints at a specific copyrighted work you have in mind, shouldn't some of the blame really go to you? After all, it's you who supplied the missing, highly specific input that made the AI reproduce a work from the training set.
I maintain that, if we want to make comparisons between transformer models (particularly LLMs) and humans, then the AI isn't like an adult human - it's best thought of as having the mentality of a four-year-old kid. That is, highly trusting and very naive. It will do its best to fulfill what you ask for, because why wouldn't it? At the point of asking, you and your query are its whole world, and it wasn't trained to distrust the user.