by geedy 2736 days ago
Can you elaborate on the benefit for a high number of hyperparameters?
2 comments

A fundamental problem is that as the number of parameters increases, the probability of sampling near the edge of the hypercube increases: with uniform sampling, it becomes ever more likely that at least one coordinate lands close to a boundary. You then no longer explore the parameter space effectively. This might be somewhat alleviated by sampling from a concentrated multivariate normal, but I guess that has its own caveats.
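
To put a number on the edge effect, here is a rough sketch. It assumes uniform sampling on the unit hypercube [0, 1]^d and calls a point "near the edge" when at least one coordinate is within eps of 0 or 1. The function names and the eps = 0.05 threshold are my own choices for illustration, not anything standard.

```python
import random


def edge_probability(dim, eps=0.05):
    """Exact probability that a uniform point in [0, 1]^dim has at least
    one coordinate within eps of a boundary: 1 - (1 - 2*eps)^dim."""
    return 1.0 - (1.0 - 2.0 * eps) ** dim


def edge_fraction(dim, eps=0.05, n_samples=20_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # A point counts as "near the edge" if any coordinate is in the eps-margin.
        if any(not (eps <= rng.random() <= 1.0 - eps) for _ in range(dim)):
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    for d in (1, 2, 4, 12, 50):
        print(d, round(edge_probability(d), 3), round(edge_fraction(d), 3))
```

With a 5% margin the exact value is 1 - 0.9^d, so it grows from 10% in one dimension toward 100% as d increases.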

If you instead have a sampling algorithm informed by the loss function, you avoid this problem. (You might instead have to worry about local minima.)
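
As one very simple illustration of "informed by the loss" (not necessarily the kind of algorithm meant above, which could just as well be a Bayesian or model-based method): the sketch below mostly proposes points near the best one found so far and only sometimes samples uniformly. The name `loss_informed_search` and the `explore` and `sigma` parameters are hypothetical.

```python
import random


def loss_informed_search(objective, bounds, budget=100, explore=0.3,
                         sigma=0.1, seed=0):
    """Minimize objective over a box using proposals guided by past losses.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    for _ in range(budget):
        if best_x is None or rng.random() < explore:
            # Exploration step: plain uniform sample over the whole box.
            x = [rng.uniform(lo, hi) for lo, hi in bounds]
        else:
            # Exploitation step: Gaussian perturbation of the best point,
            # clipped so the proposal stays inside the box.
            x = [min(hi, max(lo, xi + rng.gauss(0.0, sigma * (hi - lo))))
                 for xi, (lo, hi) in zip(best_x, bounds)]
        loss = objective(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    return best_x, best_loss
```

The local proposals are exactly where the local-minima worry comes from: with `explore` set to 0 the search can get stuck around whatever basin it found first.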

For small numbers of hyperparameters, plain random search is sometimes enough. This is not an absolute rule: sometimes random search fails miserably with just 4 parameters. My empirical rule of thumb is that for hyperparameters in machine learning (this is certainly not the case in general), random search is often enough for 4 to 12 hyperparameters if the budget for the hyperparameter search is ~100 training runs.
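
For reference, a minimal sketch of what I mean by random search with a fixed budget. `toy_loss` is a made-up stand-in for "train a model and return the validation loss", and the search space with `lr` and `dropout` is purely illustrative.

```python
import math
import random


def random_search(objective, space, budget=100, seed=0):
    """Sample `budget` independent configurations and keep the best one.

    space maps a name to (low, high, log_scale); log_scale=True samples
    uniformly in log space, which is common for learning rates.
    """
    rng = random.Random(seed)
    best_params, best_loss = None, math.inf
    for _ in range(budget):
        params = {}
        for name, (low, high, log_scale) in space.items():
            if log_scale:
                params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
            else:
                params[name] = rng.uniform(low, high)
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def toy_loss(p):
    # Hypothetical objective with its minimum at lr = 1e-3, dropout = 0.2.
    return (math.log10(p["lr"]) + 3) ** 2 + (p["dropout"] - 0.2) ** 2


space = {"lr": (1e-5, 1e-1, True), "dropout": (0.0, 0.5, False)}
best_params, best_loss = random_search(toy_loss, space, budget=100)
```

Each trial is independent of the others, so the ~100 trainings can run fully in parallel, which is part of why random search is a reasonable default at this scale.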