by SubiculumCode 2683 days ago
To what extent is this not just finding text samples in its training set and regurgitating them near verbatim? -Non-ML guy

If you look at their paper, section 4 is entirely devoted to this question. They present compelling evidence that it is generating original content, the simplest of which is its ability to write coherently about ridiculous things like talking unicorns, which nobody has ever written about in the training set.

https://d4mucfpksywv.cloudfront.net/better-language-models/l...
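
For anyone who wants to poke at the regurgitation question themselves, here is a rough sketch of an n-gram overlap check: what fraction of a generated sample's word n-grams also appear in a reference corpus. This is only an illustration under my own assumptions (whitespace tokenization, a tiny in-memory corpus, made-up example strings, hypothetical function names), not necessarily the procedure the paper uses.

```python
def ngrams(tokens, n):
    """Return the set of all contiguous n-token tuples in `tokens`."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(generated, training, n=8):
    """Fraction of the generated text's n-grams that also occur in the training text."""
    gen = ngrams(generated.split(), n)
    if not gen:
        # Sample is shorter than n tokens, so there is nothing to compare.
        return 0.0
    train = ngrams(training.split(), n)
    return len(gen & train) / len(gen)


# Made-up example strings, purely for illustration.
training_text = "the quick brown fox jumps over the lazy dog near the quiet river bank"
copied = "the quick brown fox jumps over the lazy dog"
novel = "scientists found a herd of talking unicorns living in a remote valley"

print(overlap_fraction(copied, training_text, n=4))  # every 4-gram is in the corpus
print(overlap_fraction(novel, training_text, n=4))   # no 4-gram is in the corpus
```

A fraction near 1 would suggest near-verbatim copying, while a low fraction suggests the sample is not lifted from the corpus. At real scale you would need something far more memory-efficient than Python sets.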

The talking unicorns piece was shockingly good. That is at least as coherent a news story as the average human could easily invent about it.

Reading that piece gives me the same weird feeling as watching AlphaStar playing through a StarCraft game.

You bring up a good point. Without seeing their code and training metrics, how do we know that this isn’t some extremely overfitted model?
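
If the training metrics were available, the usual sanity check would be to compare perplexity on the training data with perplexity on held-out data. Below is a minimal sketch of that comparison, assuming you already have per-token natural-log probabilities from some model. The function names and the numbers are made up for illustration and do not come from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def heldout_to_train_ratio(train_log_probs, heldout_log_probs):
    """Ratio of held-out perplexity to training perplexity.

    A ratio close to 1 means the model does about as well on unseen text as
    on its training text; a ratio far above 1 is a typical overfitting sign.
    """
    return perplexity(heldout_log_probs) / perplexity(train_log_probs)


# Made-up log-probabilities: the model assigns probability 0.25 to each
# training token and 0.2 to each held-out token.
train_lp = [math.log(0.25)] * 4
heldout_lp = [math.log(0.2)] * 4

print(perplexity(train_lp))
print(perplexity(heldout_lp))
print(heldout_to_train_ratio(train_lp, heldout_lp))
```

An overfitted model would typically show training perplexity still dropping while held-out perplexity stalls or climbs, so the ratio grows over training time.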
From the paper:

"All models still underfit WebText and held-out perplexity has as of yet improved given more training time."