by minimaxir 1015 days ago
A workaround for overfitting is to train on so much distinct data that the model can't overfit.

Newer large datasets like the ones used here optimize for diversity (e.g., SlimPajama is a heavily deduplicated dataset).
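To make "deduped" concrete, here's a rough sketch of near-duplicate filtering using Jaccard similarity over word shingles. This is a toy illustration, not SlimPajama's actual pipeline: the function names, the 3-word shingle size, and the 0.8 threshold are all assumptions picked for the example, and the pairwise comparison is quadratic, so a real corpus would need something like MinHash/LSH instead.

```python
import re


def shingles(text, n=3):
    """Return the set of lowercase word n-grams in text (n is an assumed default)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        # Short documents fall back to a single shingle of all their words.
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dedupe(docs, threshold=0.8):
    """Greedily keep each doc unless it is too similar to one already kept."""
    kept = []
    kept_shingles = []
    for doc in docs:
        s = shingles(doc)
        if all(jaccard(s, k) < threshold for k in kept_shingles):
            kept.append(doc)
            kept_shingles.append(s)
    return kept


docs = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick brown fox jumps over the lazy dog!",
    "An entirely different sentence about training data.",
]
unique_docs = dedupe(docs)
```

Here the second document differs from the first only in case and punctuation, so it gets dropped and `unique_docs` keeps the first and third.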