by winety
1455 days ago
> There is a problem because some licenses require attribution, but ignoring that... Surely the solution would be to give credit to every author from the training corpus. I am looking forward to the 10 000 lines of copyrights in every header. :P If Microsoft had trained it on its own code, there would be no such problems. Surely a company as large as Microsoft has produced enough code over the years to create a large enough training dataset.
I keep seeing this sentiment from the GPL/"laundering" side of the debate.
Believe me, Microsoft wouldn't have released this thing (after what, 6 months of beta testing?) if they thought they had any "problems" at all.
I'm not saying I don't sort of agree with you, but is there no room for what's actually _likely_ to happen in this debate? Because as best as I can tell, they aren't going to see any real legal issues from this.
(There's also an option to filter out generations that collide with actual GitHub code, just FYI.)
I feel like when the singularity happens, HN is going to be flooded with programmers mad that they got automated away, despite that very much being one of the primary goals of computer science and software engineering. This stuff is kind of just a fact of life now.
Salesforce trained models (on GitHub code) that are competitive with Copilot without needing to own GitHub. I would spend less time worrying about how to lawyer up and more time figuring out how you're going to adapt to these new tools. That's the gig.