by mccourt
2178 days ago
A very reasonable point and, certainly, the direction that parts of the computational community have embraced over the years. Take integration as an example: classical computational methods made strong assumptions about the integrand in order to significantly reduce the number of integrand evaluations (Gaussian quadrature is the main thing that comes to mind). As computation became more accessible and parallelizable, and problems became less analytic, Monte Carlo methods became more fundamental. In some distributed computing settings, memory traffic is actually the main bottleneck, and redundant computations are executed to reduce the need to send data (a similar situation to the one you aptly describe).

I think that, for hyperparameter/meta-learning optimization (or search, depending on how you think about it), we are at a point where the complexity of models that can effectively be put into production is a function of our ability to analyze, at least partially, the space of possible modeling decisions. Will we escape that, and have models whose training cost is less significant than the cost of running an "intelligent" hyperparameter search process? Maybe ... I am a GP person, so I see potential in clever analysis of circumstances so that RKHS methods (for instance) can be leveraged to simplify the training process. But the community's current trajectory has been toward increasingly expensive models, which keeps the ability to use them effectively with limited tuning/search cost relevant.

Otherwise I agree: Gaussian processes are nice and friendly, and work quite well for low-dimensional search (e.g., from a few to hundreds of hyperparameters) under very natural, general assumptions :-)
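
To make the integration example above concrete, here is a minimal sketch, assuming NumPy. The integrand (exp on [0, 1]), the helper names `gauss_legendre` and `monte_carlo`, and the sample counts are all arbitrary choices for illustration. A 5-point Gauss-Legendre rule leans on the smoothness of the integrand and needs only 5 evaluations, while plain Monte Carlo assumes almost nothing and pays for it with many more evaluations.

```python
import numpy as np


def gauss_legendre(f, a, b, n):
    """Integrate f over [a, b] with an n-point Gauss-Legendre rule."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    # Map the nodes from the reference interval [-1, 1] to [a, b].
    x = 0.5 * (b - a) * nodes + 0.5 * (b + a)
    return 0.5 * (b - a) * np.sum(weights * f(x))


def monte_carlo(f, a, b, n, rng):
    """Integrate f over [a, b] by averaging n uniform random samples."""
    x = rng.uniform(a, b, size=n)
    return (b - a) * np.mean(f(x))


exact = np.e - 1.0  # integral of exp(x) over [0, 1]
rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

gq_estimate = gauss_legendre(np.exp, 0.0, 1.0, 5)        # 5 evaluations
mc_estimate = monte_carlo(np.exp, 0.0, 1.0, 100_000, rng)  # 100,000 evaluations

print(f"Gauss-Legendre error: {abs(gq_estimate - exact):.2e}")
print(f"Monte Carlo error:    {abs(mc_estimate - exact):.2e}")
```

For a smooth one-dimensional integrand like this, the quadrature rule should be far more accurate per evaluation. The trade-off shifts once the integrand is non-smooth, high-dimensional, or cheap to evaluate in parallel, which is the situation described above.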