Hacker News
by jhugo 1462 days ago
I would agree with you if the AI was genuinely assisting with that task, but it isn't.

It's taking inputs, ignoring their licenses, permuting them in ways that are not understandable to the user, and then outputting them.

That's an entirely different task than the user reading SO or using Google and then writing their own code, because the "AI" is not capable of writing its own code at that level.

Relying on this tool means ignoring the license of code that you're copying, without even knowing that you're doing it.

> That's an entirely different task than the user reading SO or using Google and then writing their own code, because the "AI" is not capable of writing its own code at that level.

I would say it's a very similar task. If I need to remember how to use a certain function, I can Google for documentation and examples, or I can tell Copilot what I want to do. Whether the solution was presented by Copilot or by a SO thread is, in my view, irrelevant. On top of that, I doubt anyone checking SO truly knows where an answer came from. The person could simply be reproducing a snippet from somebody else, and you have no way of knowing whether it was licensed.

I don't think this is bad either. Even our current shitty copyright laws protect that kind of use. I shouldn't have to worry whether my little prime number generator uses an algorithm first created by John Carmack or Microsoft. Programming has evolved rapidly in great part because we can all build on other people's work to improve our own. Of course you shouldn't just copy and paste everything and call it a day, but that's hardly what Copilot enables anyway.

You really seem to be ignoring the core issue by focusing on SO, though. Everything on SO is fair game, but code on GitHub is under a variety of licenses. When Copilot regurgitates it, no matter how complex and inscrutable the process that leads it to do so, it may cause the Copilot user to misuse that code, because it doesn't even give them the opportunity to know where it came from or what license it was released to the public under.

Again, how does that differ from Stack Overflow? Do you go and check whether a given reply belongs to a licensed project?

Also, please consider that there is a toggle that allows you to block Copilot from using public code.

> Do you go and check whether a given reply belongs to a licensed project?

All SO questions, answers and comments are CC BY-SA. The terms of the site say that anyone submitting this content agrees that it's licensed that way, and when you visit the site you agree that you are provided with the content under that license. It's not necessary for you to check whether the submitter had the right to offer it under that license; that's their problem. The same goes for any content offered to you under a given license on any platform. I don't understand what your question has to do with the conversation.

The problem with Copilot, and I really can't believe this has to be restated over and over again, is that it takes code from projects with various licenses, and outputs it in your editor in various transformed-or-not-transformed ways (the fact that the transformation is extremely complex doesn't change anything), and gives you no way to know where the code came from, how it was licensed or how it has been transformed. So, despite the fact that if you use it enough you are virtually guaranteed to use code in contravention of its license, you cannot even know which projects you have stolen code from or which licenses' terms you are breaking.

> Also, please consider that there is a toggle that allows you to block Copilot from using public code.

Great. I'm sure its utility doesn't go down at all if you turn that toggle off...

> All SO questions, answers and comments are CC BY-SA. The terms of the site say that anyone submitting this content agrees that it's licensed that way, and when you visit the site you agree that you are provided with the content under that license.

Have you ever read GitHub's conditions to know whether they also have the right to use your code that way, no matter how you decide to license it? I feel that you are overly focused on the legal part here, which I'm sure was handled by Microsoft's lawyers. I'm more interested in the question of principle.

No matter what the terms of use at SO say, anyone can give you an answer that is a copy of some code they don't own. You may consider that immoral, but I don't, not at the scope SO is used for. In addition, the vast majority of cases at SO and Copilot are not about complex functions being stolen, it's about some dumb code you would have found in 2 minutes of googling. What I'm trying to argue here is that if we are all cool with SO and think it's useful, there is no fundamental difference here. We never cared too much about licenses for boilerplate code, and I think we shouldn't start now.

> Have you ever read GitHub's conditions to know whether they also have the right to use your code that way, no matter how you decide to license it? I feel that you are overly focused on the legal part here, which I'm sure was handled by Microsoft's lawyers. I'm more interested in the question of principle.

I have, and there is no such condition. Neither could there be: in many cases the person uploading code to GitHub is not the copyright holder (they are just doing something permitted under the license), and for a large open source project there could be thousands of copyright holders. A random person mirroring some source code to GitHub is in no position to negotiate different license terms on behalf of the copyright holder(s).

> No matter what the terms of use at SO say, anyone can give you an answer that is a copy of some code they don't own. You may consider that immoral, but I don't, not at the scope SO is used for. In addition, the vast majority of cases at SO and Copilot are not about complex functions being stolen, it's about some dumb code you would have found in 2 minutes of googling. What I'm trying to argue here is that if we are all cool with SO and think it's useful, there is no fundamental difference here. We never cared too much about licenses for boilerplate code, and I think we shouldn't start now.

I don't understand why you think a person writing an answer on SO and a computer program outputting some permutation of its inputs into your editor are the same thing. The person writing an SO answer is intelligent and capable of conceptual understanding; the computer program regurgitating code without regard to its license is not.