Hacker News
by natefinch 1464 days ago
There is logic to ensure that copilot does not emit exact duplicates of code in the training set... but that logic is significantly newer than that tweet.
2 comments

Link? I couldn't find anything "significantly newer" than 7/2/21 (though I'm sure GitHub is doing a lot here). They published this blog post on 6/30/21 about efforts to avoid reproducing raw code: https://github.blog/2021-06-30-github-copilot-research-recit.... They concluded:

> We will both continue to work on decreasing rates of recitation, as well as making its detection more precise.

Source: I work on the copilot team.
Was that decision informed by legal or product? Because derivative works are still derivative works even if you don't replicate the original verbatim.
I mean, it was informed by both, but basically everyone thinks it's a good idea.