| I've been saying this for years: thumbs up/down is the only system that makes sense to me. Foursquare uses it and I've found their scores to be way more useful than Yelp's. The biggest problem with star ratings is that they're so arbitrary. What is the difference between 3 and 3.5? What is a 1 vs a 2? 3/5 is 60%, which is almost failing when you think about it on a grading scale. If I scored something as a 3/5 I would never use that product or service again, yet many of the best restaurants are rated 3/5 on Yelp. Unless the user has some scoring system in place for different qualities of the product or service, there is no way you can get anything resembling an accurate score. I would never trust a user to accurately pick a score from 10 different options (.5-5), but I would be way more likely to trust a user to say either "I like this product" or "I do not like this product." And yes, the Wirecutter approach works great, but it just doesn't scale. |
If a place has more 5-star ratings than 4-star ratings, it's generally amazing. If it has more 4-star ratings than 5-star ratings, it's generally fine but not something particularly special.
Switching to just thumbs up/down would eliminate that distinction, which is, to me, the single most useful aspect of Yelp.
It doesn't matter that star ratings are arbitrary -- when you average enough of them out, a clear signal overrides the noise. You can distrust any given user, while still trusting the aggregate.
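A rough way to see the averaging argument is a toy simulation in Python. It assumes each rater's stars are the place's "true" quality plus independent random noise, rounded and clamped to 1-5. That is a big simplification (real reviewers are biased in correlated ways), and the function name, noise level, and quality values below are made up for illustration, not taken from any real Yelp data.

```python
import random
import statistics


def simulate_ratings(true_quality, n_raters, noise=1.0, seed=0):
    """Return n_raters noisy integer star ratings (1-5).

    Assumption: each rating is the hypothetical true quality plus
    independent Gaussian noise, rounded and clamped to the 1-5 range.
    """
    rng = random.Random(seed)
    ratings = []
    for _ in range(n_raters):
        raw = rng.gauss(true_quality, noise)
        ratings.append(min(5, max(1, round(raw))))
    return ratings


if __name__ == "__main__":
    # Two hypothetical restaurants whose true quality differs only slightly.
    for n in (5, 50, 5000):
        great = simulate_ratings(4.2, n, seed=1)
        fine = simulate_ratings(3.8, n, seed=2)
        print(n, statistics.mean(great), statistics.mean(fine))
```

With a handful of raters the two averages can land in either order, but with thousands the better place should come out reliably ahead, even though no individual rating is trustworthy.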
(Curiously enough, I don't find any equivalent value on Amazon. On Yelp, you're really evaluating an overall experience along a whole set of dimensions, so there's a lot more to discriminate on. On Amazon, it does seem to be more of a binary evaluation -- does the product work reliably or not?)