by lars
5217 days ago
A good example is regularization. You have nice proofs saying that your classifier is optimal, then you tack on a regularization term to it, which breaks your optimality proof but improves your classification accuracy. It seems unexpected, but it's not really all that surprising when you get down to the details of it.

There is nothing tacked on about a regularizer, though; it is very sound even in theory. There are several ways to look at it. One is to see it as a natural consequence of Bayes' rule: the regularizer is just the (negative) log of the prior probability. There are certain things we know or assume about the model even before looking at the data (for example, we expect the predictions to have a certain smoothness), and all of this knowledge can be incorporated into the prior; that is what the regularizer is. Another is to look at it from the standpoint of the stability of the parameter estimates. I find the former more convincing.
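
To make the Bayes view concrete, here is a minimal sketch of the simplest case I can think of: estimating a single mean from noisy data. The setup is my own assumption for illustration (Gaussian noise with standard deviation `sigma`, a zero-mean Gaussian prior with standard deviation `tau`, and made-up data points); none of the names come from a library. Under those assumptions, minimizing the negative log posterior should land on the same answer as L2-regularized least squares with strength `lam = sigma**2 / tau**2`.

```python
def neg_log_posterior(mu, xs, sigma, tau):
    """Negative log posterior of mu, up to an additive constant."""
    # Gaussian likelihood: x_i ~ N(mu, sigma^2)
    neg_log_lik = sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2)
    # Gaussian prior: mu ~ N(0, tau^2); this term plays the role of the regularizer
    neg_log_prior = mu ** 2 / (2 * tau ** 2)
    return neg_log_lik + neg_log_prior


def ridge_mean(xs, lam):
    """Closed-form minimizer of sum((x - mu)^2) + lam * mu^2."""
    return sum(xs) / (len(xs) + lam)


def minimize_1d(f, lo, hi, iters=200):
    """Ternary search; fine here because the objective is convex in mu."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Illustrative data and hyperparameters (assumed, not from the thread)
xs = [2.1, 1.9, 2.4, 1.7]
sigma, tau = 1.0, 0.5
lam = sigma ** 2 / tau ** 2  # regularization strength implied by the prior

mu_map = minimize_1d(lambda mu: neg_log_posterior(mu, xs, sigma, tau), -10.0, 10.0)
mu_ridge = ridge_mean(xs, lam)
mu_mle = sum(xs) / len(xs)  # unregularized estimate, for comparison

print("MAP:", mu_map, "ridge:", mu_ridge, "MLE:", mu_mle)
```

The MAP and ridge estimates agree up to the tolerance of the search, and both are pulled toward zero relative to the plain average. A tighter prior (smaller `tau`) means a larger `lam`, which is exactly the "more regularization" knob.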