by nexuist 1729 days ago
Is this the ML equivalent of the Dunning–Kruger effect? A model with a bit of data is too afraid of being wrong to be overconfident. A model with a bit more data becomes overconfident and gets things wrong. Finally, a model with tons and tons of data understands the complexity of the problem set and once again becomes too afraid of being wrong.
1 comment

Model confidence as reported by softmax probability scores is notoriously noisy and miscalibrated. With larger models and more data, the confidence estimates become more nuanced.
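
To make "miscalibrated" concrete, here is a minimal sketch of one common way to measure it: take the top softmax probability as the model's confidence, bin predictions by confidence, and compare average confidence to actual accuracy in each bin (often called expected calibration error). The function names, the bin count, and the toy logits and labels are all my own assumptions for illustration, not anything from the linked work.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per confidence bin.

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the top-class prediction was right
    n_bins:      number of equal-width bins (10 is an arbitrary choice here)
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Hypothetical logits and labels, made up purely for illustration.
    toy_logits = [[4.0, 0.5, 0.1], [3.5, 3.0, 0.2], [0.1, 5.0, 0.3], [2.0, 1.9, 1.8]]
    toy_labels = [0, 1, 1, 2]

    confidences, correct = [], []
    for logits, label in zip(toy_logits, toy_labels):
        probs = softmax(logits)
        predicted = probs.index(max(probs))
        confidences.append(probs[predicted])
        correct.append(predicted == label)

    print("ECE on toy data:", expected_calibration_error(confidences, correct))
```

A score of 0 would mean confidence matches accuracy in every bin; a model that says 0.9 but is right only half the time shows up as a large gap. With only a handful of samples the estimate is itself noisy, so treat this as a rough diagnostic.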