by projectorlochsa 3365 days ago
Well, this does make a little bit of sense if, after the combined estimation, things get worse for a particular case.

If the MSEs were 5, 5, 100 before, and are 10, 10, 80 after, the total got better (110 down to 100) but the predictions for the first two got worse.


This is not why it works. The James-Stein estimator applies in the case of independently distributed Normal variables with equal variance, so the individually optimal estimators for each parameter have the same MSE.
I think it is impossible for the James-Stein estimator of the whole sample to outperform all three of the individual MLEs, one for each particular parameter. Not that I would know of a way to extract an estimate for one particular parameter from a joint estimator.
I think I may have misinterpreted your original comment. It's true, as you say, that in the Stein estimator some individual MSEs get worse to yield a lower overall MSE. I was focusing on your example and assuming you meant this was only possible because some individual MSEs were larger than others to begin with (under the individually optimal estimator).
Yeah, reading through the Wikipedia article, it looks like this reduces the total error of the combined estimator, but the error for any single parameter could be worse than with that parameter's own estimator. So you can combine whatever crazy parameters you want, but it's only really relevant when you have things that are associated with each other somehow, and you want to reduce the total error of estimating all of them.
> it's only really relevant when you have things that are associated with each other somehow

The proof of lower overall MSE assumes the variables are independent.
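For anyone who wants to check this numerically, here is a rough simulation sketch (mine, not from the article). It assumes independent Normal observations with unit variance, one observation per parameter, and the plain James-Stein estimator shrinking toward zero. The choice of 10 parameters, the true means in `theta`, and the names `james_stein` and `simulate` are all illustrative assumptions, picked so the effect is easy to see.

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    """Plain James-Stein estimate shrinking toward 0.

    x has shape (n_trials, p) with p >= 3; each row is one draw of
    p independent Normal observations with known variance sigma2.
    """
    p = x.shape[-1]
    norm2 = np.sum(x ** 2, axis=-1, keepdims=True)
    return (1.0 - (p - 2) * sigma2 / norm2) * x

def simulate(theta, n_trials=100_000, seed=0):
    """Monte Carlo estimate of the per-parameter MSE of the MLE and of James-Stein."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    # Independent draws: x[i, j] ~ Normal(theta[j], 1)
    x = rng.normal(loc=theta, scale=1.0, size=(n_trials, theta.size))
    mle_mse = np.mean((x - theta) ** 2, axis=0)              # the MLE is x itself
    js_mse = np.mean((james_stein(x) - theta) ** 2, axis=0)
    return mle_mse, js_mse

# Hypothetical true means: nine parameters at 0 and one far from 0.
theta = np.zeros(10)
theta[-1] = 6.0

mle_mse, js_mse = simulate(theta)
print("per-parameter MSE, MLE:", np.round(mle_mse, 2))
print("per-parameter MSE, JS: ", np.round(js_mse, 2))
print("total MSE, MLE:", round(float(mle_mse.sum()), 2))
print("total MSE, JS: ", round(float(js_mse.sum()), 2))
```

With these particular means, the parameter that sits far from zero should come out worse under James-Stein than under its own MLE, while the total across all ten comes out lower, even though the draws are independent and every MLE starts with the same MSE.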