Hacker News
by srean 5217 days ago
Oops, hit the down-arrow without intending to. My bad, hope someone will fix that.

There is nothing tacked on about a regularizer though; it is very sound even in theory. There are several ways to look at it. One way is to see it as a natural consequence of Bayes' law: it is just the log of the prior probability (negated, if you are minimizing a cost). There are certain things we know or assume about the model even before looking at the data, for example that we expect the predictions to have a certain smoothness. All this knowledge can be incorporated into the prior model, and that is what the regularizer is. Another way is to look at it from the stability of the parameter estimates. I find the former more convincing.
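To spell out the Bayes' law view, here is a short sketch of the derivation. The symbols are my own choices rather than anything from the thread: \theta for the parameters, D for the data, and a zero-mean Gaussian prior with variance \sigma^2 as one illustrative case.

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} \, p(\theta \mid D)
  = \arg\max_{\theta} \, \bigl[ \log p(D \mid \theta) + \log p(\theta) \bigr]
  = \arg\min_{\theta} \, \bigl[ -\log p(D \mid \theta) - \log p(\theta) \bigr]

% Assumed example prior: \theta \sim \mathcal{N}(0, \sigma^2 I)
-\log p(\theta) = \frac{1}{2\sigma^2} \lVert \theta \rVert_2^2 + \text{const}
\quad\Longrightarrow\quad
\hat{\theta}_{\mathrm{MAP}}
  = \arg\min_{\theta} \, \bigl[ -\log p(D \mid \theta) + \lambda \lVert \theta \rVert_2^2 \bigr],
\qquad \lambda = \frac{1}{2\sigma^2}
```

Under that assumed prior the penalty is the familiar L2 (ridge) term, and a tighter prior (smaller \sigma^2) means a larger \lambda.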

1 comments

Absolutely, there's a pretty clear mathematical justification for regularization. However, it is very literally tacked on at the end. Take logistic regression: if you minimize the cost function without regularization, you get the maximum-likelihood estimate (MLE) of the regression parameters. But what we do is add a regularization term to that cost function. Minimizing the regularized cost function no longer gives the MLE, but it will (likely) give a better solution. It all comes down to understanding that the optimality of the MLE is an asymptotic result. The same goes for covariance matrix estimates, where there are regularization procedures that are guaranteed never to be worse than the plain MLE solution.
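A minimal sketch of the logistic regression case, assuming an L2 penalty and plain gradient descent on synthetic data. The function names, the penalty weight `lam`, the step size and the data are all made up for illustration. The only difference between the two fits is the term added to the gradient.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def norm(w):
    return math.sqrt(sum(v * v for v in w))

def fit_logistic(X, y, lam=0.0, lr=0.1, steps=2000):
    """Gradient descent on mean negative log-likelihood + lam * ||w||^2.

    lam = 0 gives (an approximation to) the MLE; lam > 0 is the
    regularized fit. Hypothetical helper, not a library API.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Gradient of the penalty term: this is the "tacked on" part.
        grad = [2.0 * lam * wj for wj in w]
        # Gradient of the mean negative log-likelihood.
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Small synthetic data set (assumed true weights, 30 samples).
random.seed(0)
w_true = [1.0, -2.0, 0.5]
X = [[random.gauss(0.0, 1.0) for _ in w_true] for _ in range(30)]
y = [1.0 if random.random() < sigmoid(sum(a * b for a, b in zip(w_true, xi))) else 0.0
     for xi in X]

w_mle = fit_logistic(X, y, lam=0.0)   # no penalty
w_map = fit_logistic(X, y, lam=0.1)   # with L2 penalty

print("unregularized norm:", norm(w_mle))
print("regularized norm:  ", norm(w_map))
```

The penalized fit is pulled toward zero relative to the unpenalized one. Whether it actually predicts better depends on the data and on the choice of `lam`, which in practice is usually tuned on held-out data.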