Hacker News
by kbar13 4068 days ago
I'm not sure that having to provide PII to post a review is a good idea. Maybe if Yelp were able to provide a kind of "reviewer's score" based on the reviewer's online activity, people would be more likely to trust reviews as genuine.

The key here is developing a good set of convergent measures, not all of which would have to be present for any one review to be weighted (think a Bayesian scoring approach?):

- Age of account

- Account engagement (patterns matter too)

- Check-ins via a mobile device at the business

- Check-ins at other nearby businesses (patterns matter too)

- Partner with a credit card company to offer Yelp-reward bucks to encourage reviews, track usage and validate reviews (and reviewers) [Use this to feed into the engagement score, above]

- Partner with OpenTable (they do this) & prompt reviews after attendance (they do this) – weight these reviews more heavily. [Use this to feed into the engagement score, above]

- Let me actually identify myself to Yelp or to the world (or just to business owners), ONLY if I want to [Use this to feed into the engagement score, above]

- Reviews of similar businesses (e.g., I like Thai food) -- patterns matter here. Do I rate all competitors poorly, etc.? Did I post all reviews in one day, etc.?

- Do my reviews vary significantly in a systematic way from others in a category? (This shouldn't be enough on its own, but variance might mean something)

- Do I post photos of the place? (Factor into engagement score, above -- but if it's only for one business, it might be a flag)

etc., etc.
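
Two of the pattern checks above (rating systematically differently from others, and posting everything in one day) are easy to sketch. This is only a rough illustration: the function names, the input shapes (dicts of business id to star rating, a list of posting dates) and the idea of returning None when a signal is unavailable are all my assumptions, not anything Yelp is known to do.

```python
from collections import Counter
from datetime import date
from statistics import mean


def mean_deviation(reviewer_ratings, business_means):
    """Average signed gap between a reviewer's stars and each business's mean.

    Both arguments are hypothetical dicts mapping business id -> star rating.
    Only businesses present in both are compared.
    """
    shared = [b for b in reviewer_ratings if b in business_means]
    if not shared:
        return None  # no overlap, so this signal is simply unavailable
    return mean(reviewer_ratings[b] - business_means[b] for b in shared)


def burst_fraction(review_dates):
    """Fraction of a reviewer's reviews posted on their single busiest day."""
    if not review_dates:
        return None
    busiest_count = Counter(review_dates).most_common(1)[0][1]
    return busiest_count / len(review_dates)


# Example: someone who rates two Thai places far below their averages
# and posted three of four reviews on the same day.
deviation = mean_deviation({"thai_a": 1, "thai_b": 1}, {"thai_a": 4, "thai_b": 4.5})
burst = burst_fraction([date(2014, 1, 1)] * 3 + [date(2014, 2, 1)])
```

Neither number should be enough on its own, as noted above; they would just be inputs to whatever combined score gets built.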

Really – a statistical model shouldn't be that hard to do -- maybe processor intensive, but hard? It doesn't feel hard, given all the data they're sitting on...
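
To make the "Bayesian scoring" idea a bit more concrete, here is a minimal naive-Bayes-style sketch that adds up log-odds from whichever signals happen to be present and skips the rest. The signal names, the likelihood ratios and the 0.8 prior are all made up for illustration; a real model would have to fit them from labeled data, and the independence assumption between signals is a simplification.

```python
import math

# Hypothetical likelihood ratios, P(signal | genuine) / P(signal | fake),
# given as (ratio if the signal is True, ratio if it is False).
# These values are invented for illustration, not fitted to any data.
LIKELIHOOD_RATIOS = {
    "old_account": (3.0, 0.5),
    "checked_in_on_mobile": (5.0, 0.8),
    "verified_purchase": (10.0, 0.9),
    "all_reviews_same_day": (0.2, 1.2),
}


def genuine_probability(signals, prior=0.8):
    """Combine the available signals into an estimate of P(review is genuine).

    `signals` maps signal name -> True/False. Unknown names and None values
    are skipped, so no single measure has to be present to score a review.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for name, observed in signals.items():
        if name not in LIKELIHOOD_RATIOS or observed is None:
            continue
        if_true, if_false = LIKELIHOOD_RATIOS[name]
        log_odds += math.log(if_true if observed else if_false)
    return 1.0 / (1.0 + math.exp(-log_odds))


# A review with a card-verified visit but nothing else known:
score = genuine_probability({"verified_purchase": True})
```

With no signals at all the function just returns the prior, and each extra measure nudges the score up or down, which is roughly the "convergent measures" idea.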

Work has been done on this. Apparently you can detect fake reviews to some degree using text features alone:

http://www.cs.uic.edu/~liub/FBS/fake-reviews.html

I'm pretty sure the review filter (that everyone loves to hate) works this way.
I've assumed it's something like that, but it shouldn't be too hard for them to be a little more forthcoming about their methodology without compromising its value.