by adlpz
1137 days ago
I agree there's a difference in the ease of access, a competitive advantage, sure. And I get that people writing public-source (however licensed) software don't want to make it easier for them (as in, Microsoft) to make money off of "learning" (of the machine type) from it. That's fair. However, at first glance, it still feels to me like an unavoidable reality that if you publish source code, it'll eventually be ingested by Copilot or whatever comes next. I mean, for the rest of the content that all the fancy new LLMs have been trained on, there wasn't a GitHub equivalent. They just used massive scraped dumps of text from wherever they could find them, which most definitely included trillions of lines of very much copyrighted text. In short: not only do I not really see an issue with Copilot-like AIs learning from publicly available code (as I described in the GP comment), but I also think that if you publish code anywhere at all, it's inevitable that it'll end up in Copilot, regardless of where you host it. If you want to make it more expensive for Microsoft to scrape it, sure, go ahead, but I don't think it matters in the long run.
I’d be quite careful with this view.
By your logic, it should be OK to take the Linux kernel, copy it, build it, then sell it and give nothing back to the community that built it, and then just blame the authors for uploading it to the internet?