by ActivatedAI 691 days ago
There is some good published research on doing multiple passes over the training data and on how quickly learning saturates. The TL;DR is that diminishing returns kick in after about 4 epochs.

https://arxiv.org/abs/2305.16264

1 comment

Yep, I have seen this paper before, and thank you for linking it here for reference. My personal opinion is that, compared to single-epoch scaling laws, the effects of multiple epochs still need more evidence and literature, but this paper is one of the best results we have so far on training for multiple epochs.