Hacker News
by alyxya, 107 days ago
Fundamentally, I don't believe second-order methods get better data efficiency by themselves, but changes to the optimizer can, because they change the convergence behavior. ML theory lags behind the results in practice.