HN Mirror



	by SubiculumCode 2683 days ago
	To what extent is this not just finding text samples in its training set and regurgitating them near verbatim? -Non-ML guy

2 comments

HALtheWise 2683 days ago

If you look at their paper, section 4 is entirely devoted to this question. They present compelling evidence that it is generating original content, the simplest of which is its ability to write coherently about ridiculous things like talking unicorns, which nobody has ever written about in the training set.

https://d4mucfpksywv.cloudfront.net/better-language-models/l...
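
For anyone who wants to poke at the "regurgitating near verbatim" question themselves, one generic way to probe it is to measure how many n-grams of a generated sample also appear in the training text. This is only a minimal sketch under assumptions: whitespace tokenization, a training set small enough to fit in memory, and the hypothetical helper names `ngrams` and `overlap_fraction`. It is not claimed to be the procedure the paper uses.

```python
def ngrams(tokens, n):
    """Return the set of all contiguous n-token tuples in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(generated, training_docs, n=8):
    """Fraction of the generated text's distinct n-grams that also occur in training_docs.

    Assumption: plain whitespace tokenization, chosen only for illustration.
    """
    gen = ngrams(generated.split(), n)
    if not gen:
        # Generated text is shorter than n tokens, so there is nothing to compare.
        return 0.0
    train = set()
    for doc in training_docs:
        train |= ngrams(doc.split(), n)
    return len(gen & train) / len(gen)


# Toy data, made up for this example.
training_docs = ["the quick brown fox jumps over the lazy dog"]
copied = "the quick brown fox jumps over the lazy dog"
novel = "a herd of talking unicorns lived in a remote valley"

print(overlap_fraction(copied, training_docs, n=3))
print(overlap_fraction(novel, training_docs, n=3))
```

A fraction near 1 would suggest copying, while a fraction near 0 suggests the sample is not lifted verbatim. A low score does not by itself rule out close paraphrase.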


arcticfox 2683 days ago

The talking unicorns piece was shockingly good. That is at least as coherent a news story as the average human could easily invent on the subject.

Reading that piece gives me the same weird feeling as watching AlphaStar playing through a StarCraft game.


applecrazy 2683 days ago

You bring up a good point. Without seeing their code and training metrics, how do we know that this isn’t some extremely overfitted model?


vedant 2682 days ago

From the paper:

"All models still underfit WebText and held-out perplexity has as of yet improved given more training time."
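
For context on what that sentence is measuring: perplexity is the exponential of the average negative log-likelihood per token, and an overfitted model would typically show a much lower perplexity on its training data than on held-out data. A rough sketch of that comparison follows, using made-up per-token log-probabilities; the function names `perplexity` and `generalization_gap` are hypothetical and none of the numbers come from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity: exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def generalization_gap(train_log_probs, heldout_log_probs):
    """Held-out perplexity divided by training perplexity.

    A ratio far above 1 hints at overfitting; a ratio near 1 does not.
    """
    return perplexity(heldout_log_probs) / perplexity(train_log_probs)


# Illustrative values only: the model assigns probability 0.25 to every
# training token and 0.20 to every held-out token.
train_lp = [math.log(0.25)] * 4
heldout_lp = [math.log(0.20)] * 4

print(perplexity(train_lp))
print(perplexity(heldout_lp))
print(generalization_gap(train_lp, heldout_lp))
```

If held-out perplexity keeps dropping as training continues, the model has not yet started memorizing at the expense of generalization, which is the sense of "underfit" in the quote.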
