by mccourt 2626 days ago
I always appreciate articles emphasizing the importance of hyperparameter optimization; thank you for writing this. The discussion of learning rate is a nice additional point, though I find it a bit misleading -- earlier in the post you mention a number of hyperparameters, but then learning rate is studied in a vacuum. If other hyperparameters were varied along with the learning rate, I assume those graphics would look much more complicated.
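To make the "in a vacuum" point concrete, here is a toy sketch of my own (not from the article): plain gradient descent with momentum on a one-dimensional quadratic. The function name `final_loss`, the curvature, the step count, and the grid of values are all assumptions chosen for illustration. The same learning rate can diverge or converge depending only on the momentum setting, so a one-dimensional learning-rate plot changes shape as soon as a second hyperparameter moves.

```python
def final_loss(lr, momentum, curvature=1.0, steps=200):
    """Run gradient descent with momentum on f(x) = 0.5 * curvature * x**2.

    Starts at x = 1 with zero velocity and returns the loss after `steps` updates.
    """
    x, velocity = 1.0, 0.0
    for _ in range(steps):
        grad = curvature * x                        # derivative of the quadratic
        velocity = momentum * velocity - lr * grad  # classical momentum update
        x += velocity
    return 0.5 * curvature * x * x


if __name__ == "__main__":
    # Hypothetical grid: sweep the learning rate at two momentum settings.
    learning_rates = [0.01, 0.1, 1.0, 2.5]
    for momentum in (0.0, 0.9):
        row = ", ".join(
            f"lr={lr}: {final_loss(lr, momentum):.2e}" for lr in learning_rates
        )
        print(f"momentum={momentum}: {row}")
```

On this toy problem, lr=2.5 blows up without momentum but converges with momentum 0.9. A real network is far messier, but the interaction is the same kind of thing.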

Additionally, practical hyperparameter tuning with Bayesian optimization often involves complications: dealing with discrete hyperparameters, large parameter spaces that are unreasonably costly to search or poorly modeled, accounting for uncertainty in your metric, balancing competing metrics, and handling black-box constraints. Obviously, one cannot mention everything in a blog post; I just wanted to point out that outstanding researchers in Bayesian optimization are pushing forward on all of these topics.

Regardless, thank you for continuing to hammer home the value of hyperparameter optimization. If I may, a few links for anyone trying to learn more:

My favorite BO intro - https://arxiv.org/abs/1807.02811
AutoML from the Freiburg crew - http://papers.nips.cc/paper/5872-efficient-and-robust-automa...
Some discussion on parallelism/high dimensions - https://bayesopt.github.io/papers/2017/3.pdf
Strategies for warm starting - https://ml.informatik.uni-freiburg.de/papers/18-AUTOML-RGPE....