by gugagore 1813 days ago
I can't say what's happening inside GitHub Copilot, but it's not necessarily true that the only way to produce syntactically valid output is to copy substrings of the training text. A model can learn something approximating a generative grammar.

Take a look at Karpathy's post on character-level RNNs, which generate plausible-looking LaTeX and C source: https://karpathy.github.io/2015/05/21/rnn-effectiveness/

At the same time, I would not be surprised if some outputs do reproduce the training data verbatim.
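To make both points concrete, here is a toy sketch. It is not an RNN and has nothing to do with how Copilot works; it is a plain bigram model over whitespace-separated tokens, and the two-line corpus and the function names are made up for illustration. Even something this crude emits valid statements that are not substrings of anything it was trained on, alongside ones that are exact copies.

```python
import ast
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentinel tokens

def train_bigrams(corpus):
    """Record which token was seen following which token."""
    follows = defaultdict(set)
    for line in corpus:
        tokens = [START] + line.split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            follows[prev].add(nxt)
    return follows

def enumerate_outputs(follows, max_len=8):
    """List every sequence the model can emit, up to max_len tokens."""
    results = set()
    stack = [(START, [])]
    while stack:
        prev, words = stack.pop()
        for nxt in follows[prev]:
            if nxt == END:
                results.add(" ".join(words))
            elif len(words) < max_len:
                stack.append((nxt, words + [nxt]))
    return results

# Made-up training data: two assignment statements.
corpus = ["x = a + b", "y = c * d"]

model = train_bigrams(corpus)
outputs = enumerate_outputs(model)

# Outputs that are not a substring of any training line.
novel = {s for s in outputs if not any(s in line for line in corpus)}
# Outputs that reproduce a training line exactly.
memorized = outputs & set(corpus)

# ast.parse raises SyntaxError if any output is not valid Python.
for s in outputs:
    ast.parse(s)

print("novel:", sorted(novel))
print("memorized:", sorted(memorized))
```

The novel outputs come from the shared `=` token, which lets the model splice the left side of one line onto the right side of the other. A real model generalizes in far richer ways, but the same mix of recombination and verbatim recall shows up.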