by aix1 96 days ago
The 42 -> 137 change also jumped out at me. On the face of it, the associated improvement sure does sound like overfitting to the eval set.