|
|
|
|
|
by munchler
457 days ago
|
|
In the ML problem I'm working on now, there are about a dozen simple hyperparameters, and each training run takes hours or even days. I don't think there's any good way to search the space of hyperparameters without a deep understanding of the problem domain, and even then I'm often surprised when a minor config tweak yields better results (or fails to). Many of these hyperparameters affect performance directly and are very sensitive to hardware limits, so a bad value leads to an out-of-memory error in one direction or a runtime measured in years in the other. It's a real-world halting problem on steroids. This is not to even mention more complex design decisions, like the architecture of the model, which can't be captured in a simple hyperparameter. |
|