


	by Salgat 1731 days ago
	To add to this, there's a misleading phenomenon where performance first gets worse as you add data/parameters/epochs, but oddly improves again if you throw even more at the model.

2 comments

jointpdf 1731 days ago

For the interested, this phenomenon is known as (deep) double descent:

https://openai.com/blog/deep-double-descent/

https://www.lesswrong.com/posts/FRv7ryoqtvSuqBxuT/understand...

(Edit: Oh, the definition appears in the abstract of the linked paper.)
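
For anyone who wants to poke at this on a toy problem, here is a minimal sketch, not taken from either link. It assumes a synthetic linear target, random ReLU features, and a minimum-norm least-squares fit via the pseudoinverse; the function name random_feature_errors and every parameter value are made up for illustration. Sweeping the number of features past the number of training points is one common way to look for a bump in test error near the interpolation threshold, but whether it shows up depends on the noise level and the seed.

```python
import numpy as np


def random_feature_errors(widths, n_train=40, n_test=500, d=10, noise=0.5, seed=0):
    """Fit min-norm least squares on random ReLU features for each width.

    Hypothetical toy setup: the target is linear in the inputs and only the
    training labels are noisy. Returns a list of (width, train_mse, test_mse).
    """
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    x_train = rng.normal(size=(n_train, d))
    x_test = rng.normal(size=(n_test, d))
    y_train = x_train @ w_true + noise * rng.normal(size=n_train)
    y_test = x_test @ w_true  # noiseless targets for evaluation

    results = []
    for p in widths:
        # Fixed random first layer; only the linear readout is fitted.
        w_feat = rng.normal(size=(d, p)) / np.sqrt(d)
        f_train = np.maximum(x_train @ w_feat, 0.0)
        f_test = np.maximum(x_test @ w_feat, 0.0)

        # The pseudoinverse gives the least-squares solution when p < n_train
        # and the minimum-norm interpolating solution when p > n_train.
        beta = np.linalg.pinv(f_train) @ y_train

        train_mse = float(np.mean((f_train @ beta - y_train) ** 2))
        test_mse = float(np.mean((f_test @ beta - y_test) ** 2))
        results.append((p, train_mse, test_mse))
    return results


if __name__ == "__main__":
    for p, train_mse, test_mse in random_feature_errors([5, 10, 20, 40, 80, 160, 640]):
        print(f"p={p:4d}  train_mse={train_mse:.4f}  test_mse={test_mse:.4f}")
```

The thing to watch is how test_mse behaves around p = n_train (40 here) compared with much smaller and much larger p.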


nexuist 1731 days ago

Is this the ML equivalent of the Dunning–Kruger effect? A model with a bit of data is too afraid of being wrong to be overconfident. A model with a bit more data is overconfident in itself and gets things wrong. Finally, a model with tons and tons of data understands the complexity of the problem set and once again becomes too afraid of being wrong.


visarga 1731 days ago

Model confidence as reported by softmax probability scores is notoriously noisy and miscalibrated. With larger models and more data, the confidence estimates get more nuanced.
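
To make "miscalibrated" concrete, here is a rough sketch of how softmax confidence can be compared against actual accuracy. It uses expected calibration error (ECE) with equal-width bins, which is one common metric and my choice for illustration, not something named in this thread; the function names and the bin count are likewise assumptions.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned gap between mean confidence and accuracy, weighted by bin size.

    Confidence is the top softmax probability for each example. A model whose
    confidence matches its accuracy in every bin scores 0.
    """
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(1000, 5))
    labels = rng.integers(0, 5, size=1000)  # random labels: accuracy near chance
    for scale in (1.0, 5.0):
        probs = softmax(scale * logits)
        print(f"scale={scale}  mean_conf={probs.max(axis=1).mean():.3f}  "
              f"ece={expected_calibration_error(probs, labels):.3f}")
```

Scaling the logits changes the reported confidence without changing a single prediction, which is part of why raw softmax scores are a shaky stand-in for how sure the model "is".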
