by Kiro 1821 days ago
No, that's not how it works.

"[...] the vast majority of the code that it suggests is uniquely generated and has never been seen before. We found that about 0.1% of the time, the suggestion may contain some snippets that are verbatim from the training set"

https://copilot.github.com/#faqs

2 comments

If that's the case (only 0.1%), the developers must have done something differently from the other OpenAI experiments I recall seeing, where significant chunks of code from Stack Overflow or similar sites appeared in the generated answers.
So you're gambling on whether the code was generated or copied.
No, you aren't. Courts will consider it fair use.
How are you going to prove it was the AI that reproduced the GPL-licensed function verbatim from another project, rather than you just opening that project and copying the function yourself?
I will not. Courts will simply not consider a single function a substantial enough piece of work for copying it to constitute unfair use.
Use a Bloom filter to skip or regenerate that 0.1%.
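To make the Bloom filter idea concrete, here is a rough Python sketch. It is not how Copilot works as far as I know; the names (`BloomFilter`, `windows`, `index_training_code`, `looks_verbatim`), the 8-token window, and the filter size are all assumptions picked for illustration. The idea: hash every 8-token window of the training code into the filter, then flag any suggestion that shares a window with it.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def windows(code, size=8):
    """Yield every run of `size` whitespace-separated tokens in `code`."""
    tokens = code.split()
    for i in range(max(len(tokens) - size + 1, 0)):
        yield " ".join(tokens[i:i + size])


def index_training_code(snippets, size=8):
    """Build a filter containing every token window of the training snippets."""
    bloom = BloomFilter()
    for snippet in snippets:
        for window in windows(snippet, size):
            bloom.add(window)
    return bloom


def looks_verbatim(suggestion, bloom, size=8):
    """True if any token window of the suggestion is (probably) in the filter."""
    return any(window in bloom for window in windows(suggestion, size))


if __name__ == "__main__":
    training = ["def add ( a , b ) : return a + b"]
    bloom = index_training_code(training)
    print(looks_verbatim("def add ( a , b ) : return a + b", bloom))
```

A hit means "probably seen in training", so the caller would skip or regenerate that suggestion. Since a Bloom filter can give false positives, a few original suggestions would get thrown away too, but it never misses a window that was actually indexed. Whitespace tokenization is the crudest possible choice; renamed variables or reformatted code would slip through.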