by iaseiadit
844 days ago
Not an expert, but this paper explores double descent with simple models. The interpretation there: when you extend into the overparameterized regime, that permits optimization towards small-norm weights, which generalize well again. Does that explain double descent generally? Does it apply to other models (e.g. DNNs)? https://arxiv.org/pdf/2303.14151.pdf
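For anyone who wants to poke at the small-norm idea, here is a rough toy sketch. It is my own setup, not code from the paper: a linear model fit by minimum-norm least squares on the first p of 200 Gaussian features, with p swept past the number of training points. The names `min_norm_fit` and `sweep`, the sizes, the noise level and the seed are all made up for illustration.

```python
import numpy as np

def min_norm_fit(X, y):
    # np.linalg.lstsq returns the least-squares solution; when many solutions
    # exist (more features than samples) it returns the one with smallest 2-norm.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def sweep(n_train=40, n_test=500, d_total=200,
          widths=(10, 20, 40, 80, 200), noise=0.5, seed=0):
    # All sizes and the noise level are arbitrary choices for this toy example.
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d_total) / np.sqrt(d_total)
    X_tr = rng.normal(size=(n_train, d_total))
    X_te = rng.normal(size=(n_test, d_total))
    y_tr = X_tr @ w_true + noise * rng.normal(size=n_train)
    y_te = X_te @ w_true  # noiseless targets for measuring test error
    rows = []
    for p in widths:
        # The model only sees the first p features.
        w = min_norm_fit(X_tr[:, :p], y_tr)
        train_mse = float(np.mean((X_tr[:, :p] @ w - y_tr) ** 2))
        test_mse = float(np.mean((X_te[:, :p] @ w - y_te) ** 2))
        rows.append((p, float(np.linalg.norm(w)), train_mse, test_mse))
    return rows

if __name__ == "__main__":
    for p, norm, train_mse, test_mse in sweep():
        print(f"p={p:4d}  ||w||={norm:9.3f}  train={train_mse:.2e}  test={test_mse:9.3f}")
```

In this kind of setup I would expect the weight norm and test error to blow up near p = n_train and come back down as p grows past it, which is at least consistent with the small-norm story. It says nothing about whether the same mechanism carries over to DNNs.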