| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by RC_ITR 1016 days ago
	Not to be a downer, but wasn’t one of OpenAI’s earliest discoveries that training small models on huge datasets leads to over-fitting? It’s my understanding that the entire race to ever-more parameters was driven by that.

2 comments

A workaround to overfitting is to train on so much distinct data that the model can't overfit.

Newer large datasets like the ones used here optimize for diversity. (e.g. SlimPajama is a heavily-deduped dataset)

Learn about the magic of double descent

Yeah, the line keeps going down as the model gets bigger. What's your point? That there's a hump in the middle?