by pps43
3333 days ago
The choice of m and C need not be exact. It is enough to choose them so that (1) if there are no ratings, the Bayesian average is close to the overall mean, and (2) if there are many ratings (how many depends on how big the site is), C and m do not affect the result much. You can probably do a little better if you have a lot of data and the ability to run A/B tests, but for the vast majority of cases pseudocounts work just fine.

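To make the two conditions concrete, here is a minimal sketch in Python. It assumes the common convention that m is the prior mean (roughly the site-wide average rating) and C is the number of pseudo-ratings added at that value; the thread does not spell this out. The function name `bayesian_average` and the numbers are made up for illustration.

```python
def bayesian_average(ratings, m, C):
    """Shrink the mean of `ratings` toward the prior mean `m`.

    Assumed convention: `C` acts as a count of pseudo-ratings, each equal to `m`.
    `C` must be positive so that an item with no ratings is still defined.
    """
    if C <= 0:
        raise ValueError("C must be positive")
    n = len(ratings)
    return (C * m + sum(ratings)) / (C + n)


# Condition 1: with no ratings, the result is the prior mean m.
no_ratings = bayesian_average([], m=3.5, C=10)

# Condition 2: with many ratings, m and C barely move the result
# away from the observed mean (5.0 here).
many_ratings = bayesian_average([5] * 1000, m=3.5, C=10)

print(no_ratings, many_ratings)
```

With only a handful of ratings the result sits between m and the observed mean, which is where the exact choice of C matters most.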
It would be interesting, though, to have people guess suitable values of m and C and then see how close their MSEs get to the James-Stein MSE. I suspect that some people's guesses would be meaningfully off target.