by nialv7 496 days ago
Slightly off topic, but I rarely see papers talk about their failed training runs and why those runs failed. This paper is definitely a breath of fresh air. Their analyses of the failures, the changes they made to fix them, and the rationale behind those changes are all very insightful.

The R1 paper did it as well. Agreed, it's always very interesting.