by l3robot 3325 days ago
Where does the article state that zero training error is a good thing? The authors only show that almost every modern neural network can reach zero training error even when the labels are randomized, in which case generalization is impossible. Hence, these networks can learn the dataset by heart. Given that, the authors can use the test error as the generalization indicator.
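
To make the "learn it by heart" point concrete, here is a toy sketch of the random-label setup. It is not the authors' experiment: I'm assuming a plain perceptron with more weights than training points as a stand-in for an over-parameterized network, and the names (`train_perceptron`, `random_label_experiment`) and sizes are made up for illustration.

```python
import random

def train_perceptron(xs, ys, max_epochs=1000):
    """Plain perceptron; stops once every training point is classified correctly."""
    w = [0.0] * len(xs[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                # Standard perceptron update on a misclassified point.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w

def error_rate(w, xs, ys):
    """Fraction of points whose predicted sign disagrees with the label."""
    wrong = sum(1 for x, y in zip(xs, ys)
                if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0)
    return wrong / len(xs)

def random_label_experiment(n_train=30, n_test=2000, dim=100, seed=0):
    """Hypothetical toy setup: Gaussian inputs, labels drawn independently of them."""
    rng = random.Random(seed)

    def sample(n):
        xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
        ys = [rng.choice([-1, 1]) for _ in range(n)]  # labels carry no signal
        return xs, ys

    train_x, train_y = sample(n_train)
    test_x, test_y = sample(n_test)
    w = train_perceptron(train_x, train_y)
    return error_rate(w, train_x, train_y), error_rate(w, test_x, test_y)

train_err, test_err = random_label_experiment()
print("train error:", train_err, "test error:", test_err)
```

With more dimensions than training points, the random labels are linearly separable, so the model fits them perfectly, while the test error stays around chance because there is nothing to generalize from.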

Indeed, careful hyperparameter choice is now the only key to good generalization. As I understood it, the goal here is rather to show that the link between a network's regularization and its generalization power is far from being as clear as it is for other ML algorithms like SVMs.

In short, NN hyperparameters help to reach generalization, but cannot "explain" it. That is the key difference here between practice and theory.