Hacker News
by credit_guy 3365 days ago
Here's my intuition. Let's say you have 1000 coin flippers. Each of them flips a fair coin 10 times, and none of them has any special powers. Many of them will get about as many heads as tails, but there's a good chance you'll get some who get 9 or 10 heads, and also some who get 9 or 10 tails. Since the probability of getting 10 heads in a row is 1/1024, if you see one or two guys who get only heads, or only tails, you will attribute that to the natural variability of the outcomes.
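To put rough numbers on that, here is a small sketch that computes the exact binomial probabilities for the setup above (1000 flippers, 10 fair flips each). The helper name `prob_at_least` is just something I made up for the illustration.

```python
from fractions import Fraction
from math import comb


def prob_at_least(k, n=10):
    """Probability that a fair coin shows at least k heads in n flips."""
    return Fraction(sum(comb(n, j) for j in range(k, n + 1)), 2 ** n)


flippers = 1000
p_all_heads = prob_at_least(10)  # exactly 1/1024
p_nine_plus = prob_at_least(9)   # 9 or 10 heads: 11/1024

# Expected number of flippers with an extreme run in either direction
# (heads or tails), hence the factor of 2.
expected_all_same = flippers * 2 * p_all_heads
expected_nine_plus_either = flippers * 2 * p_nine_plus

# Chance that at least one of the 1000 flippers gets all heads or all tails.
p_any_all_same = 1 - (1 - 2 * p_all_heads) ** flippers

print(float(expected_all_same))          # about 2 flippers
print(float(expected_nine_plus_either))  # about 21 flippers
print(float(p_any_all_same))
```

So on average about two flippers end up with a perfect streak, and about twenty get 9 or more of the same side, with no skill involved anywhere.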

Now imagine that these are not coin flippers, but some guys who have some skills to do something, but the outcome has a large variability nonetheless. For example, running backs in the NFL. There are running backs (RBs) who average 2 yards per carry (ypc), and others who average 5. 5 ypc is stellar by the way, 4 is very good, 3 is decent, and 1 or 2 not so much. But obviously, RBs get a different yardage on each carry. Now, let's say you follow the first 4 games of the season and get the average ypc for each RB. You would like to predict for each RB the average ypc for the rest of the year. The classical statistical estimation is that the current average is the best estimator for the future average, but from the extreme example with the coin flippers above, we know that this is not quite the case. Using a Bayesian estimation, we get that a better estimator is to move the current average towards the overall mean. This is called a shrinkage or James-Stein estimator. In the case of the coin flippers, you move the average all the way to 1/2, and that estimator is correct. In the case of the running backs, you don't shrink that much, and it's a cute exercise in math to see how much you shrink if you assume some distribution of true ypc across the RBs in the league, and some distribution of an RB's per-carry yardage around his own true ypc.
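One way to do that exercise, as a minimal sketch: assume a normal-normal model, where true ypc across the league is normal around the league mean and each carry is normal around the RB's true ypc. The posterior mean is then a weighted average of the RB's observed average and the league mean. This is the plain Bayesian version with known variances, not the James-Stein estimator proper (which estimates the amount of shrinkage from the data). All the numbers below (league mean 4.0, spread of talent 0.5, per-carry spread 6.0, 60 carries) are made up for illustration, and `shrink` is a hypothetical helper name.

```python
def shrink(observed_mean, n, prior_mean, prior_sd, noise_sd):
    """Posterior mean of the true average under a normal-normal model.

    observed_mean: average over the n observations seen so far
    prior_mean, prior_sd: mean and spread of true averages in the population
    noise_sd: spread of a single observation around the true average
    """
    if n <= 0:
        return prior_mean  # no data yet, so fall back on the population mean
    sampling_var = noise_sd ** 2 / n  # variance of the observed average
    prior_var = prior_sd ** 2
    # Weight on the observed data: 0 means shrink fully, 1 means don't shrink.
    weight = prior_var / (prior_var + sampling_var)
    return prior_mean + weight * (observed_mean - prior_mean)


# Hypothetical RB: 5.5 ypc over 60 carries in the first 4 games.
rb_estimate = shrink(5.5, n=60, prior_mean=4.0, prior_sd=0.5, noise_sd=6.0)
print(round(rb_estimate, 2))

# Coin flippers: nobody has any skill, so prior_sd is 0 and even a
# 10-for-10 streak is shrunk all the way back to 1/2.
flipper_estimate = shrink(1.0, n=10, prior_mean=0.5, prior_sd=0.0, noise_sd=0.5)
print(flipper_estimate)
```

With these made-up inputs the weight on the observed average is only about 0.29, so the 5.5 gets pulled most of the way back towards 4. More carries, or a wider spread of talent in the league, would mean less shrinkage.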

If you want some further intuition, think of the Sports Illustrated curse. It was observed that NFL players who make it to the cover of SI are generally "cursed", i.e. they don't do as well afterwards as they did before. One amusing case is the (former) New England Patriot Jonas Gray, who made the cover of SI after a phenomenal game against the Indianapolis Colts in 2014 (201 rushing yards, 4 touchdowns), but then he showed up late to work and was promptly benched for the rest of the season. Generally though, players don't do anything stupid like that, but simply "regress to the mean". That regression to the mean is what explains the shrinkage estimator, and the Stein paradox.