by rmoriz 269 days ago
Microsoft/GitHub has no model training. How do you think Copilot works? Also, if you publish open source code, people and companies are gonna use it.

When I publish open source code, I don't mind if people or companies use it, or maybe even learn from it. What I don't like is feeding it into a giant plagiarism machine that is perpetuating the centralization of power on the internet.
To me, plagiarism is a 100% copy of intellectual property, or maybe a high percentage, like 80% or more.

LLMs don't store the code itself, only probabilities over sequences of tokens (roughly, words or word fragments). AFAIK this is not plagiarism.

I remember the late 2000s, when a German company called "Rocket Internet" was copycatting companies like AirBnB, Zappos and others. Many consider this lame and a kind of moral freeloading, but it's not prohibited.

> Microsoft/GitHub has no model training. How do you think Copilot works?

I'm sure if you used that big, smug brain of yours you'd piece together exactly what I meant. Here's a search query to get the juices flowing:

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

Whether you agree with why someone may be opting to self-host a git server is immaterial to why they've done so. Likewise, I'm not going to rehash the debate over fair use vs software licenses. Pretending you don't understand why someone who published code under a copyleft license is displeased with it being locked inside a proprietary model that is used to build proprietary software is willful ignorance. But, again, it makes no difference whether you're right or they're right; no one is obligated to continue pushing open source code to GitHub or any other service.