by iaseiadit 844 days ago
Not an expert, but this paper explores double descent using simple models. Its interpretation: once you move into the overparameterized regime, many weight settings fit the training data, which lets optimization settle on small-norm weights, and those generalize well again. Does that explain double descent (DD) in general? Does it carry over to other models (e.g. DNNs)?

https://arxiv.org/pdf/2303.14151.pdf
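
To make the "small-norm weights" part concrete, here is a minimal sketch in plain Python. It is not code from the paper. The data (`X`, `y`) and the helper names are made up for illustration, and it assumes linear regression with more weights than training points. It shows that gradient descent started from zero ends up at the minimum-norm solution among all the weight vectors that fit the data exactly.

```python
# Hypothetical toy problem: 2 training points, 3 weights (overparameterized),
# so infinitely many weight vectors fit the data exactly.
X = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def norm(v):
    return dot(v, v) ** 0.5


def gradient_descent(X, y, steps=2000, lr=0.1):
    """Minimize 0.5 * ||Xw - y||^2, starting from w = 0."""
    Xt = transpose(X)
    w = [0.0] * len(X[0])
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(X, w), y)]
        grad = matvec(Xt, residual)  # gradient is X^T (Xw - y)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def min_norm_solution(X, y):
    """Closed form w = X^T (X X^T)^{-1} y, written out for a 2-row X only."""
    G = [[dot(r1, r2) for r2 in X] for r1 in X]  # 2x2 Gram matrix X X^T
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    alpha = [(G[1][1] * y[0] - G[0][1] * y[1]) / det,
             (-G[1][0] * y[0] + G[0][0] * y[1]) / det]
    return matvec(transpose(X), alpha)


w_gd = gradient_descent(X, y)
w_min = min_norm_solution(X, y)

# Another exact fit: add a null-space direction of X (X @ [2, -1, 1] = 0).
w_other = [a + b for a, b in zip(w_min, [2.0, -1.0, 1.0])]

print("gradient descent:", w_gd, "norm", norm(w_gd))
print("min-norm closed form:", w_min, "norm", norm(w_min))
print("other interpolator:", w_other, "norm", norm(w_other))
```

Both `w_gd` and `w_other` fit the two training points, but only the former has the smallest norm. That covers only the optimization half of the argument for a linear model. It does not show on its own that small-norm weights generalize better, and it says nothing about DNNs.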