by hashta, 38 days ago
I think I've trained models with #params >> #training examples for hundreds of epochs, but I still don't recall seeing that loss curve on real data. Curious whether others have seen it with larger models or much longer runs.