by newhouseb 5400 days ago
> When you view unfiltered results, the per page number mysteriously changes to 10 per page. [...] Plus the results are pretty slow to load, quite slower than the results for filtered reviews.

Caching. I'm sure most reviews that pass the filter (the ones shown by default) are cached, whereas the filtered-out reviews are not, and reaching past the cache can be expensive. One way to mitigate this is to reduce the number of results you pull, hence 10 per page.
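To make the trade-off concrete, here is a minimal sketch of what I mean. It is a guess at the general shape, not Yelp's actual code: `fetch_reviews`, `query_db`, the dict-based cache, and both page sizes are hypothetical (the numbers are just the ones mentioned in this thread).

```python
from typing import Callable

# Hypothetical page sizes, borrowed from the numbers mentioned in this thread.
CACHED_PAGE_SIZE = 40
UNCACHED_PAGE_SIZE = 10


def fetch_reviews(
    business_id: str,
    hidden: bool,
    cache: dict,
    query_db: Callable[[str, bool, int], list],
) -> list:
    """Return one page of reviews, going to the database only on a cache miss.

    `query_db(business_id, hidden, limit)` stands in for the expensive query.
    """
    if not hidden:
        key = (business_id, "default")
        if key not in cache:
            # Cache miss: pay for the query once, then reuse the result.
            cache[key] = query_db(business_id, False, CACHED_PAGE_SIZE)
        return cache[key]
    # Hidden reviews are assumed to be uncached, so every request hits the
    # database. Asking for fewer rows keeps the cost of each request bounded.
    return query_db(business_id, True, UNCACHED_PAGE_SIZE)
```

The point is only that the uncached path pays full price on every request, so shrinking the page is the cheapest knob to turn.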

> Why do you need to enter in a captcha to view the unfiltered reviews? Why would they care if you were a bot only for the unfiltered reviews and not the normal reviews?

If you can write a script to deduce the filtering algorithm then you can by definition write reviews that thwart it. With less data, it is harder to deduce the filtering algorithm. In other words, a captcha thwarts high-volume review fraud.

> The filter algorithm seems to be clearly flawed and simply catches way too many reviews that should not be filtered.

I think most people underestimate the difficulty of the problem. Unlike e-mail spam, which is easy for a human to spot, fake reviews are very hard for a human to spot. How can you tell, just from the writing, whether a consumer was induced to write a positive review so that they could get a few bucks off their order? You can't. You can look at other statistical trends behind such reviews (such as a sudden wave of positive reviews), but you're only looking for side effects of the primary problem, and thus you will never achieve perfect performance from a method like this.
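As an illustration of the "sudden wave" kind of signal, here is a toy sketch. It is not Yelp's algorithm; the function name `burst_days`, the 4-star cutoff, and the 3x threshold are all assumptions picked for the example.

```python
from collections import Counter
from datetime import date
from statistics import median


def burst_days(reviews, min_stars=4, factor=3.0):
    """Flag days with an unusually large number of positive reviews.

    `reviews` is an iterable of (day, stars) pairs. A day is flagged when its
    count of positive reviews exceeds `factor` times the median count over
    all days that had at least one positive review.
    """
    daily = Counter(day for day, stars in reviews if stars >= min_stars)
    if not daily:
        return []
    baseline = median(daily.values())
    return sorted(day for day, n in daily.items() if n > factor * baseline)


# Five quiet days with one positive review each, then a wave of five.
history = [(date(2011, 1, d), 5) for d in range(1, 6)]
history += [(date(2011, 1, 6), 5)] * 5
flagged = burst_days(history)
```

Note what this does and does not tell you: it flags a day, not a review, and a restaurant that just got written up in the paper would trip it too. That is the side-effect problem in miniature.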

Yelp takes the (somewhat philosophical) viewpoint that reviews from customers who are coerced into writing them are less genuine than they would be otherwise. I believe that this view drives a lot of their algorithm and possibly threatens its accuracy in a way that is ultimately not worth it. I think there are a number of things that Yelp could do to increase users' trust in reviews that don't involve filtering. One simple thing would be for a user's review of an Indian restaurant to show me that user's breakdown of reviews of other Indian restaurants.
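The breakdown idea is cheap to compute. A rough sketch, with a made-up data shape (each review as a `(category, stars)` pair) and a hypothetical function name:

```python
from collections import Counter


def rating_breakdown(user_reviews, category):
    """Count how many 1..5 star ratings a user has given within one category.

    `user_reviews` is an iterable of (category, stars) pairs.
    """
    counts = Counter(stars for cat, stars in user_reviews if cat == category)
    # Always report all five buckets so empty ones show up as zero.
    return {stars: counts.get(stars, 0) for stars in range(1, 6)}


reviews_by_user = [("indian", 5), ("indian", 5), ("indian", 2), ("thai", 4)]
breakdown = rating_breakdown(reviews_by_user, "indian")
```

Shown next to a review, a histogram like this lets the reader judge for themselves whether the reviewer hands out five stars to everyone.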

TL;DR: This is a much harder problem than it seems at first glance, partly because of the nature of the problem and partly because of how Yelp has framed it for themselves.

Disclaimer: I used to work at Yelp, but no longer do. The people I worked with were all stand-up guys.

2 comments

I guess most of your points make sense. I just feel like Yelp does very little to be open and transparent. I get that it's very difficult; nobody ever said it's easy to algorithmically guess review spam.

But they clearly don't want users to see unfiltered reviews. A tiny gray link below all 40 reviews, then a captcha (or two or three) and then a slow user experience before you can see the filtered reviews is lame.

I agree with you about showing other reviews of the same subject; that would be neat. I guess if I were Yelp, I would try harder to stand up for my algorithms and show more data about why they work and why users are better off having these amazing algorithms.

I had an experience a year or so ago with a friend who started a moving service in SF. A couple of months after he started the business, he noticed a review on Yelp from some dude who claimed that, during a moving job, the mover took a smoke break and peed all over the sofa he was moving. Not only was the story ridiculously false, but my buddy had no idea who the reviewer was. The review did NOT get filtered, however, even after he responded to it and contacted Yelp, and he was stuck with this crazy review at the top of his profile. This went on for months and it really damaged his credibility. Meanwhile, positive reviews from legitimate customers, who would naturally have newer profiles or whatever, would get filtered. It just seems like Yelp should be more sophisticated. (And yes, they are 10000% better than TripAdvisor)

Thank you newhouseb! This is the most intelligent comment I've seen on the issue, by far.

(I hope this comment makes it through the filter, I swear it's not a fake... I don't even know newhouseb...)