Very interesting. What is the state of the art in hyperparameter optimization at the moment? Does this method apply to all deep learning systems?
For a general overview, [1] is a good starting point. For deep learning specifically, you may want to start with [2], though I personally have had good results with Hyperband [3].
Hyperband [3] has been my go-to hyperparameter optimization method over the past few years. It has handily beaten Bayesian search wherever I've applied it, and it's implemented in most frameworks.
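For anyone who hasn't looked at how Hyperband works, here's a rough sketch of the bracket loop as I understand it from the paper [3]. It's a toy, not a reference implementation: `sample_config`, `evaluate`, and `toy_loss` are hypothetical names I made up, the objective is a fake one-dimensional loss, and the `max(1, ...)` guard on the number of survivors is my own simplification. In practice you'd use whatever your framework ships.

```python
import math
import random


def hyperband(sample_config, evaluate, max_resource=81, eta=3, seed=0):
    """Toy Hyperband: run several successive-halving brackets and keep the best.

    sample_config(rng) -> config        (hypothetical callback)
    evaluate(config, resource) -> loss  (hypothetical callback, lower is better)
    """
    rng = random.Random(seed)

    # s_max = floor(log_eta(max_resource)), computed with integers
    # to avoid floating-point rounding in math.log.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    best_config, best_loss = None, math.inf

    # Each bracket s trades off "many configs, little budget each"
    # (large s) against "few configs, full budget each" (s = 0).
    for s in range(s_max, -1, -1):
        n = math.ceil((s_max + 1) * eta ** s / (s + 1))  # initial configs
        r = max_resource / eta ** s                       # initial budget
        configs = [sample_config(rng) for _ in range(n)]

        # Successive halving inside the bracket.
        for i in range(s + 1):
            budget = r * eta ** i
            losses = [evaluate(c, budget) for c in configs]
            ranked = sorted(range(len(configs)), key=lambda k: losses[k])

            if losses[ranked[0]] < best_loss:
                best_loss = losses[ranked[0]]
                best_config = configs[ranked[0]]

            # Keep the top 1/eta fraction (at least one, my simplification).
            keep = max(1, len(configs) // eta)
            configs = [configs[k] for k in ranked[:keep]]

    return best_config, best_loss


def toy_loss(x, resource):
    # Fake objective: optimum at x = 0.3, and more budget means lower loss.
    return (x - 0.3) ** 2 + 1.0 / resource


best_x, best = hyperband(lambda rng: rng.random(), toy_loss)
print(f"best x = {best_x:.3f}, loss = {best:.4f}")
```

With a real model, `resource` would be something like epochs or training-set size, and `evaluate` would return validation loss after training for that budget. The early stopping of bad configs is where the savings over plain random search come from.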
I just got back into hyperopt a couple weeks ago. It's easy enough and worked for me, but I was thinking there had to be some new things I'm not aware of.
[1] https://wires.onlinelibrary.wiley.com/doi/full/10.1002/widm....
[2] https://github.com/google-research/tuning_playbook
[3] https://jmlr.csail.mit.edu/papers/v18/16-558.html