by mturmon
4420 days ago
Everything you say is right, of course. Yet I upvoted the story, because I thought it was indicative of a larger trend in which crowdsourced data are used to illustrate a point. The Google Flu Trends articles are one example: they have gone around HN at least twice, once when they were successful (https://news.ycombinator.com/item?id=5040204) and once when they were critiqued (e.g., https://news.ycombinator.com/item?id=7455307). I work a lot with sampled data, and I have found that sampling issues can be some of the most difficult to appreciate and to quantify, even for experts. I guess it comes down to sampling from one distribution, P(x), when the situation you really care about samples according to a different distribution, P'(x). If P is far from P', your conclusions from P can be arbitrarily bad. If you have an adversary moving P around deliberately, as here, it's even worse.
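To make the P versus P' point concrete, here is a minimal sketch, not anything from the story itself. It assumes a made-up two-group population of raters; the group names, the rating values, and the `frac_fans` parameter are all hypothetical. The only thing it shows is that the same estimator (a plain mean) gives very different answers depending on who shows up to vote.

```python
import random


def mean_rating(sample):
    """Plain sample mean of a list of ratings."""
    return sum(sample) / len(sample)


def draw_ratings(rng, n, frac_fans):
    """Draw n ratings (1-10 scale) from a two-group mixture.

    The groups and their rating values are invented for illustration.
    frac_fans is the probability that a given rater is a "fan".
    """
    ratings = []
    for _ in range(n):
        if rng.random() < frac_fans:
            ratings.append(rng.choice([8, 9, 10]))  # hypothetical "fans"
        else:
            ratings.append(rng.choice([1, 2, 3]))   # hypothetical "detractors"
    return ratings


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# P': the population you actually care about (assumed 70% fans).
target = draw_ratings(rng, 10_000, frac_fans=0.7)

# P: the people who actually voted (assumed 10% fans, e.g. after a campaign).
observed = draw_ratings(rng, 10_000, frac_fans=0.1)

gap = mean_rating(target) - mean_rating(observed)
print(f"mean under P' = {mean_rating(target):.2f}")
print(f"mean under P  = {mean_rating(observed):.2f}")
print(f"gap           = {gap:.2f}")
```

Collecting more votes from P does not close the gap; it only gives a tighter estimate of the wrong quantity. An adversary who controls `frac_fans` can push the observed mean anywhere between the two group means.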
If there is an interesting statistical result, it's that the movie's rating is entirely consistent with crowdsourced predictions. The theory is that the 'wisdom of crowds' results directly from diversity among those making predictions.[1] In the case of the lowest-rated movie, those making predictions were unusually homogeneous, so an inaccurate prediction of its quality is unsurprising.
Again, it's all in the interpretation, e.g. there's statistical evidence that a lot of morons ranked The Matrix.
[1] Diversity Prediction Theorem: http://vserver1.cscs.lsa.umich.edu/~spage/ONLINECOURSE/predi...
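For anyone who wants to check the theorem in [1] numerically: it says that the squared error of the crowd's average prediction equals the average squared error of the individuals minus the diversity (variance) of their predictions. A rough sketch follows; the two sets of predictions and the "true" value are made up for illustration, and `diversity_prediction` is just a name chosen here.

```python
def diversity_prediction(predictions, truth):
    """Return (crowd_error, avg_individual_error, diversity).

    crowd_error          = (mean prediction - truth)^2
    avg_individual_error = mean of (prediction - truth)^2
    diversity            = mean of (prediction - mean prediction)^2
    The theorem states: crowd_error = avg_individual_error - diversity.
    """
    n = len(predictions)
    crowd = sum(predictions) / n
    crowd_error = (crowd - truth) ** 2
    avg_individual_error = sum((p - truth) ** 2 for p in predictions) / n
    diversity = sum((p - crowd) ** 2 for p in predictions) / n
    return crowd_error, avg_individual_error, diversity


truth = 7.0                          # hypothetical "true" quality
diverse = [2.0, 5.0, 8.0, 9.0]       # made-up, spread-out predictions
homogeneous = [2.0, 2.0, 2.5, 1.5]   # made-up, tightly clustered predictions

for name, preds in [("diverse", diverse), ("homogeneous", homogeneous)]:
    crowd_err, indiv_err, div = diversity_prediction(preds, truth)
    print(f"{name}: crowd error {crowd_err:.3f} = "
          f"{indiv_err:.3f} - {div:.3f}")
```

With almost no diversity to subtract, the homogeneous crowd is about as wrong as its typical member, which is the situation described above for the lowest-rated movie.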