

	by esafak 408 days ago
	...because stochastic methods are implicit regularizers, leading to solutions that generalize better. Let's spell it out for those who don't know. https://www.inference.vc/notes-on-the-origin-of-implicit-reg...


jampekka 408 days ago

OLS is a convex optimization problem, so this doesn't really apply. And for statistical analysis you really don't want to add poorly understood artificial noise to the parameter estimates anyway.
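To make the convexity point concrete, here is a minimal sketch, not anything posted in the thread. It uses synthetic data and plain full-batch gradient descent as a deterministic stand-in for SGD; the helper `gradient_descent`, the data sizes, and the step counts are all assumptions chosen for illustration. With a full-rank design the least-squares loss has a single minimizer, so running the iteration long enough should land on the closed-form OLS solution, and any shrinkage only shows up if you stop early.

```python
import numpy as np

# Synthetic regression problem (all sizes and noise levels are illustrative assumptions).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(scale=0.5, size=n)

# Closed-form least-squares solution.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)


def gradient_descent(X, y, steps, lr):
    """Full-batch gradient descent on the mean squared error, started from zero."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


# Step size 1/L, where L is the largest eigenvalue of the Hessian X^T X / n.
L = np.linalg.eigvalsh(X.T @ X / n).max()
lr = 1.0 / L

w_early = gradient_descent(X, y, steps=3, lr=lr)     # stopped early
w_late = gradient_descent(X, y, steps=2000, lr=lr)   # run to convergence

print("distance to OLS after 2000 steps:", np.linalg.norm(w_late - w_ols))
print("norm early / norm OLS:", np.linalg.norm(w_early) / np.linalg.norm(w_ols))
```

Started from zero with this step size, each component of the iterate moves monotonically toward its OLS value, so the early-stopped estimate has a smaller norm than the OLS one. That is the sense in which the optimizer can act as a regularizer even on a convex problem; whether that is desirable for inference is the point under dispute here.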


esafak 408 days ago

In general you do, because the unbiased estimates have higher generalization error. You are already dealing with sampling noise. I am not an expert in optimization, and I don't know what "poorly understood" means to you, but I know there is quite a bit of research on the properties of SGD noise; e.g., https://francisbach.com/rethinking-sgd-noise/

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning https://arxiv.org/abs/2301.13703
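A small worked example of the bias-variance claim, offered as a sketch under strong assumptions rather than a general result: suppose the design satisfies X^T X = n I and the noise has variance sigma^2. Then OLS has expected squared estimation error d * sigma^2 / n, and ridge with penalty lam is just OLS shrunk by the factor n / (n + lam). The function names, the numbers, and the choice of ridge as the biased estimator are all illustrative assumptions; ridge stands in here for whatever shrinkage SGD noise might induce.

```python
import numpy as np


def ols_risk(d, n, sigma2):
    """Expected squared estimation error of OLS when X^T X = n * I (assumed)."""
    return d * sigma2 / n


def ridge_risk(w, n, sigma2, lam):
    """Expected squared estimation error of ridge under the same assumption.

    Ridge equals the OLS estimate scaled by s = n / (n + lam), so the error
    splits into squared bias (1 - s)^2 * ||w||^2 plus variance s^2 * d * sigma2 / n.
    """
    s = n / (n + lam)
    bias_sq = (1.0 - s) ** 2 * np.sum(w ** 2)
    variance = s ** 2 * len(w) * sigma2 / n
    return bias_sq + variance


# Hypothetical setting: 10 coefficients all equal to 1, 50 samples, noise variance 4.
w = np.ones(10)
n, sigma2 = 50, 4.0
lam = len(w) * sigma2 / np.sum(w ** 2)  # penalty that minimizes ridge_risk in this setup

risk_ols = ols_risk(len(w), n, sigma2)
risk_ridge = ridge_risk(w, n, sigma2, lam)
print("OLS risk:  ", risk_ols)
print("ridge risk:", risk_ridge)
```

In this setup the biased estimator comes out ahead (roughly 0.74 versus 0.8), because the variance it removes outweighs the bias it adds. The gap shrinks as n grows or the noise falls, so this illustrates the mechanism and says nothing about how large the effect is in any real analysis.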
