


	by vedant 2721 days ago
	From the paper: "All models still underfit WebText and held-out perplexity has as of yet improved given more training time."