by esafak
271 days ago
Statistical learning theory (SLT) typically uses empirical risk minimization, leading to the bias-variance decomposition and a single optimum in model complexity, where the monotonically decreasing bias supposedly balances against the monotonically increasing variance. We now know this does not accurately describe overparameterized models, which exhibit double descent and other phenomena such as grokking. To explain them you have to look past classical statistics to statistical mechanics.
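
For reference, a sketch of the decomposition in question for squared loss. The notation is mine and assumes data generated as y = f(x) + ε with zero-mean noise of variance σ², and an estimator f̂_D fit on a random training set D:

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\bigl(y - \hat{f}_D(x)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\bigl(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The classical U-shaped test error curve comes from assuming the first term falls and the second rises as model complexity grows. The identity itself still holds for overparameterized models; what double descent contradicts is the assumption that variance keeps rising.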