by throw_away_777
3318 days ago
This statement:
"Or in other words: the model, its size, hyperparameters, and the optimiser cannot explain the generalisation performance of state-of-the-art neural networks."
is not true and very misleading. Careful selection of hyperparameters and the model can clearly improve generalization. The article is making a mistake in assuming that getting to zero training error is a good or desirable thing. In fact, a large part of hyperparameter optimization consists of choices that ensure generalization, and some of the fundamental choices, such as early stopping and many others, do determine how well the model generalizes. If your model has zero training error, you have likely made poor choices.
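To make the early stopping point concrete, here is a minimal sketch in plain Python. The helper name `early_stopping_epoch`, the `patience` parameter, and the loss values are all made up for illustration; real frameworks ship their own versions with more options. The idea is just to keep the weights from the epoch with the best validation loss and stop once that loss has not improved for a few epochs in a row, rather than training until the training error hits zero.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the epoch whose weights would be kept.

    Hypothetical helper: scans per-epoch validation losses and stops once
    the loss has failed to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Validation loss has stalled or risen: stop training here.
                break

    return best_epoch


# Illustrative (invented) validation losses: they fall, then rise as the
# model starts to overfit.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
best = early_stopping_epoch(val_losses, patience=3)
print(best)
```

In this made-up run the validation loss bottoms out well before the last epoch, so continuing to train would lower the training error while making generalization worse.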
|
Indeed, careful hyperparameter choice is currently the only key to good generalization. As I understood it, the goal here is rather to show that the link between a network's regularization and its generalization power is far from being as clear as it is for other ML algorithms such as SVMs.
In short, NN hyperparameters help to reach generalization, but cannot "explain" it. That is the key difference here between practice and theory.