
by jackson1372 2384 days ago

The reason you want to over-parameterize your model is that it protects you from "bad bounce" learning trajectories. You effectively spread out your overfitting risk until it's pretty close to 0.

Or at least that's the way I like to think of it.

The next step is to compress the resulting model into a simpler, less computationally costly network.
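To make the over-parameterization point a bit more concrete, here is a rough sketch of the kind of experiment people use to probe it: fit random ReLU features with an almost ridgeless least-squares solve, and compare train/test error as the feature count passes the number of training points. Everything here is an assumption for illustration (the synthetic linear target, the noise level, the tiny `ridge` term, the function names); it is not taken from any particular paper or library. You would typically expect test error to be worst near `n_features == n_train` and to improve again well past it, but that is not guaranteed for every seed.

```python
import math
import random


def relu_features(x, weights):
    """Map an input vector to random ReLU features, one per weight vector."""
    return [max(0.0, sum(w_i * x_i for w_i, x_i in zip(w, x))) for w in weights]


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def fit_and_score(n_features, n_train=30, n_test=200, dim=5, ridge=1e-8, seed=0):
    """Return (train_mse, test_mse) for a random-feature regression.

    Assumed toy setup: noisy linear target, Gaussian inputs. The data is
    drawn before the feature weights, so it is identical across n_features.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(dim)]

    def sample(n):
        xs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
        ys = [sum(t * v for t, v in zip(target, x)) + rng.gauss(0, 0.5) for x in xs]
        return xs, ys

    x_train, y_train = sample(n_train)
    x_test, y_test = sample(n_test)
    weights = [[rng.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)]
               for _ in range(n_features)]
    phi_train = [relu_features(x, weights) for x in x_train]
    phi_test = [relu_features(x, weights) for x in x_test]

    # Kernel form of ridge regression: alpha = (Phi Phi^T + ridge * I)^-1 y,
    # coef = Phi^T alpha. A tiny ridge approximates the minimum-norm fit.
    gram = [[sum(a * b for a, b in zip(phi_train[i], phi_train[j]))
             + (ridge if i == j else 0.0)
             for j in range(n_train)] for i in range(n_train)]
    alpha = solve(gram, y_train)
    coef = [sum(alpha[i] * phi_train[i][j] for i in range(n_train))
            for j in range(n_features)]

    def mse(phis, ys):
        preds = [sum(c * f for c, f in zip(coef, phi)) for phi in phis]
        return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

    return mse(phi_train, y_train), mse(phi_test, y_test)


if __name__ == "__main__":
    # Sweep from under-parameterized, through n_features == n_train, to heavily over-parameterized.
    for p in (5, 15, 30, 60, 300):
        train_mse, test_mse = fit_and_score(p)
        print(f"features={p:4d}  train_mse={train_mse:.3e}  test_mse={test_mse:.3e}")
```

The kernel form keeps the linear solve at `n_train` by `n_train` no matter how many features there are, which is why the same code path covers both sides of the interpolation threshold.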

1 comment

gyuserbti 2384 days ago

Are you suggesting dd is sort of about local minima? Like, if you extended the risk vs. parametrization curve out further, you'd start to see overfitting again?
