by zaptheimpaler 1797 days ago
Basically you are saying any product that's free should be able to break the law arbitrarily? Google can decide to post all your email online, your location history for all time, your photos, and you would be ok with that because it's free? The undoubtedly free DNS server you use can leak all your requests too? You're ok with that?

Yes, it's free; no, they did not tell me or ask for my permission before using the code in this way. Free does not give them the right to break existing laws and licenses. It's pretty simple.


However, you did give them permission to use your code when you accepted their terms and conditions[1] on creating an account. IANAL, and I don't know if this section would hold up to scrutiny in a court of law, but I'm pretty sure this is what their legal team considers to cover them when it comes to training Copilot on code hosted with them.

[1] https://docs.github.com/en/github/site-policy/github-terms-o...

People routinely share code on GitHub that is not owned, or at least not fully owned, by them, so GitHub can't really rely only on the ToS.
See one paragraph above what I initially linked. They cover that also.
IANAL, but that is not my reading. They cover "Your Content" with the license grant, but not "any Content". The user still has the right to post "any Content" if they have the appropriate license to do so, but obviously they can't grant additional licenses to content the user doesn't own.

As I understand it, your reading is that users who upload code they don't own the copyright to, but otherwise have the right to copy under a license, are in violation of the ToS in general.

My reading is that the license grant only applies to "Your Content" as defined in the ToS, and otherwise users are free to upload code with a permissive license, and doing so _does not grant_ additional licenses to GitHub.

The TOS is not a blanket grant for them to do anything they like with the material. As I said elsewhere: https://news.ycombinator.com/item?id=27823862

> Certainly the GitHub TOS grants them some common-sense ability to copy the code you upload so that they can usefully host it. Can you point to the portion that allows them to use it for Copilot?

> Because I'm pretty sure it doesn't. Section D4:

> > This license does not grant GitHub the right to sell Your Content. It also does not grant GitHub the right to otherwise distribute or use Your Content outside of our provision of the Service...

> You grant us and our legal successors the right to store, archive, parse, and display Your Content, and make incidental copies, as necessary to provide the Service, including improving the Service over time

> The “Service” refers to the applications, software, products, and services provided by GitHub, including any Beta Previews.

Google actually did do Copilot for Gmail. Nobody noticed though.
That's actually a great point. Ditto for GDocs. I've been pleasantly surprised at how good autocomplete suggestions have been in docs lately.

If I were to hazard a guess, I'd say that the vitriol around Copilot stems from five factors that distinguish it from Google:

(1) The length of the suggestions alongside some of Copilot's marketing demonstrated that perhaps non-trivial replacement of engineers with AI might not be as far-fetched or far away as most people thought. Google's autocomplete has yet to make me feel replaceable.

(2) The content of the training data had a clearer intrinsic commercial value, making perceived license violations feel more 'real'.

(3) GitHub (historically) didn't have the same reputation as Google for training AI models on data uploaded to its free services. People likely (mis)placed some trust in GitHub when they uploaded code, and this backlash is part of the adjustment process.

(4) The indication that Copilot will eventually be a paid commercial service, effectively building a commercial service off the backs of millions of open source developers. While this is perfectly legal and common across all industries, it doesn't feel good.

(5) Copilot spitting out raw training data really doesn't help its image.