by machinekob 1212 days ago
I hate it when people don't include an estimate of the training done before the final hyperparameters are found, as it's the most costly part of the whole process most of the time.

They just say, yes, we trained it for this long, etc., but they never talk about the tens or even hundreds of runs before they finalize the model parameters and architecture -.-

1 comment

Aren't those done on a smaller version of the same model?