by yariang 5183 days ago
While it is good to look at these sorts of mathematically rigorous algorithms, I think I would be frustrated if they were used everywhere. Or, well, maybe not me, but a non-technical user would be.

The beauty of the second algorithm for rating products is that it is straightforward. Having never seen it before I can deduce that 5 stars come before 4 stars and more reviews come before fewer. If I want to skip ahead to the 4 stars I know what to do. I can internalize the sorting algorithm easily. And as a user, understanding the order items are presented to me is important.
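
To make that ordering concrete, here is a minimal sketch of the sort as this comment describes it: higher star average first, ties broken by more reviews. The `Item` type and its field names are made up for illustration; nothing here is Amazon's actual code.

```python
from typing import List, NamedTuple


class Item(NamedTuple):
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    avg_stars: float
    num_reviews: int


def sort_simple(items: List[Item]) -> List[Item]:
    """Sort by average stars (descending), then by review count (descending)."""
    return sorted(items, key=lambda it: (-it.avg_stars, -it.num_reviews))


if __name__ == "__main__":
    catalog = [
        Item("many reviews", 4.5, 580),
        Item("one review", 5.0, 1),
        Item("three reviews", 5.0, 3),
    ]
    for item in sort_simple(catalog):
        print(item)
```

Both 5.0 items land ahead of the 4.5 item with 580 reviews, which is easy to predict from the rule but is also the behavior the replies complain about.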

If Amazon were to use the last algorithm and present items in that order (assuming we accounted for the 5 star vs positive/negative issue), it would look like a random order to most users and would be frustrating.

So I guess what I am saying is that this algorithm is very clever, but in some cases, it may be too clever. Sometimes you just want to keep it Simple Stupid.

3 comments

The second algorithm, in my experience, is too simple, though. When browsing Amazon I'm pretty regularly annoyed by an item with one 5* review appearing ahead of an item with hundreds of 4* and 5* reviews.

One simple fix would be to avoid calculating an average until a minimum number of ratings has been given. But I do think the statistical way is lovely. If I were Amazon I'd give it some kind of snappy trademarked name and push it as a feature.
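
A rough sketch of the minimum-ratings idea, assuming a made-up threshold of 5 (the comment does not name a number) and assuming unrated items should sort after rated ones:

```python
from typing import Optional, Sequence, Tuple

MIN_RATINGS = 5  # hypothetical threshold, not from the thread


def displayed_average(ratings: Sequence[int],
                      min_ratings: int = MIN_RATINGS) -> Optional[float]:
    """Return the mean star rating, or None until enough ratings exist."""
    if len(ratings) < min_ratings:
        return None
    return sum(ratings) / len(ratings)


def sort_key(ratings: Sequence[int]) -> Tuple[bool, float]:
    """Rated items first (highest average first), unrated items last."""
    avg = displayed_average(ratings)
    return (avg is None, -(avg or 0.0))


if __name__ == "__main__":
    products = [[5], [4, 4, 5, 5, 5]]
    print(sorted(products, key=sort_key))
```

With this threshold, the item with a single 5* review gets no average at all and falls behind the item with five ratings.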

Instead of displaying stars, Amazon could display a percentage, which under the hood represents the Wilson confidence number. It would be totally intuitive to browse: first come all the 100% items, then the 99's, and so on.
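
For reference, a minimal sketch of the lower bound of the Wilson score interval, assuming votes are binary (positive or not) and a 95% confidence level (z = 1.96). The function and parameter names are made up for this example.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a binomial proportion.

    positive: number of positive votes
    total:    total number of votes
    z:        normal quantile for the confidence level (1.96 is about 95%)
    """
    if total == 0:
        return 0.0  # assumption: items without votes score zero
    phat = positive / total
    z2 = z * z
    centre = phat + z2 / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


if __name__ == "__main__":
    print(wilson_lower_bound(1, 1))      # a single positive vote
    print(wilson_lower_bound(570, 600))  # many votes, mostly positive
```

Under these assumptions a single positive vote scores roughly 0.21, so an item needs many positive votes before its score gets near the top of the range.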
You can't use Wilson's confidence with a star-rating system. Wilson's only works for binary systems.

Instead you could use a weighted Bayesian rating:

br = ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes)
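
The formula above translates directly into code. This is only a sketch: the site-wide figures in the example (50 votes per item on average, 3.5 average rating) are invented, and returning the site average for an item with no votes is an assumption.

```python
def bayesian_rating(this_num_votes: float, this_rating: float,
                    avg_num_votes: float, avg_rating: float) -> float:
    """Weighted rating that pulls items with few votes toward the site average."""
    denominator = avg_num_votes + this_num_votes
    if denominator == 0:
        return avg_rating  # assumption: fall back to the site average
    return (avg_num_votes * avg_rating
            + this_num_votes * this_rating) / denominator


if __name__ == "__main__":
    # Invented site-wide figures, for illustration only.
    AVG_NUM_VOTES, AVG_RATING = 50, 3.5
    few = bayesian_rating(2, 5.0, AVG_NUM_VOTES, AVG_RATING)
    many = bayesian_rating(401, (400 * 5 + 4) / 401, AVG_NUM_VOTES, AVG_RATING)
    print(round(few, 2), round(many, 2))
```

With those invented figures, two 5-star reviews come out at about 3.56, while 400 fives plus one four come out at about 4.83.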

er... and the problem with bucket categorizing?

80-100% = * * * * *

60%-79% = * * * *

etc..
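
A sketch of that bucketing, assuming the "etc." continues in equal 20-point steps (40-59% is three stars, and so on) and that even the lowest scores still get one star. Only the top two buckets are spelled out above; the rest is a guess.

```python
def stars_for_percent(percent: float) -> int:
    """Map a 0-100 score to 1-5 stars using 20-point buckets.

    Only the 80-100 and 60-79 buckets come from the comment above;
    the lower buckets are an assumed continuation of the pattern.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return min(5, int(percent // 20) + 1)


if __name__ == "__main__":
    for score in (100, 80, 79, 60, 59, 0):
        print(score, "*" * stars_for_percent(score))
```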

How is it a problem if the five-star reviews display first, then the four-star, and so on?
The point is we have to determine how to define a five-star item, a four-star one, etc. Currently, an Amazon item's star value is the average of the star values of every review. The author is saying that that's a bad way to compute the item's star value. The author would argue an item with only two reviews that are both fives should have a lower star value than an item with 400 fives and 1 four. We typically associate stars with the averaging algorithm (i.e. we define an item's star value as the average of the star values of its reviews), so it might help if we do away with the notion that each item has a star value, and just think of this as saying an item with 400 reviews of 5 stars and 1 review of 4 stars should be shown before an item that just has 2 reviews of 5 stars.

Currently, when we see an item's star value, we think of it as an indicator of the quality of the item. But if it's just the average of the star values of every review, the author would argue that we're not going to get an accurate indicator of quality. The author argues that whether the quality indicator of an item is expressed in stars or percentages, that value should be determined by the third algorithm, not the second, and that the order the items are shown in should be the result of sorting those quality indicators.

But how is Simple Stupid in the Amazon case a better output for the user? Do you, as an Amazon shopper, really believe that the item with one 5-star review is a better bet for you than the item with 580 reviews and an average of 4.5-stars?
I don't, but I can intuitively grasp that a 5-star item with 2 reviews is not reliable. Since I understand how the sorting works, I know I have to jump to the 4.5-star items and check how many reviews each one has. If those also have only a few reviews, I will jump to the 4-star items.

The point is, I understand the sorting order and can work around it if I am not satisfied with what is presented to me. Having a very esoteric algorithm is a risk. Maybe you'll present just what the user really wanted. But if you get it wrong, they will be at a loss to do anything about it. I tend to dislike systems that leave users helpless when something goes wrong.