The key here is developing good points of convergent measures, not all of which would have to be valid for any one review to be weighted (think a Bayesian scoring approach?):

- Age of account
- Account engagement (patterns matter too)
- Check-ins via a mobile device at the business
- Check-ins at other nearby businesses (patterns matter too)
- Partner with a credit card company to offer Yelp-reward bucks to encourage reviews, track usage, and validate reviews (and reviewers). [Use this to feed into the engagement score, above]
- Partner with OpenTable (they do this) & prompt reviews after attendance (they do this); weight these reviews more heavily. [Use this to feed into the engagement score, above]
- Let me actually identify myself to Yelp or to the world (or to just business owners), ONLY if I want to. [Use this to feed into the engagement score, above]
- Reviews of similar businesses (e.g., I like Thai food); patterns matter here. Do I rate all competitors poorly? Did I post all my reviews in one day?
- Do my reviews vary significantly in a systematic way from others in a category? (This shouldn't be enough on its own, but variance might mean something.)
- Do I post photos of the place? (Factor into engagement score, above, but if it's only for one business, it might be a flag.)
- etc., etc.

Really, a statistical model shouldn't be that hard to do. Maybe processor intensive, but hard? It doesn't feel hard, given all the data they're sitting on...
http://www.cs.uic.edu/~liub/FBS/fake-reviews.html
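To make the "Bayesian scoring" idea above a bit more concrete, here is a rough sketch of how a handful of those signals could be combined naive-Bayes style, where any signal that is unknown for a given review is simply skipped. Everything in it is an assumption for illustration: the signal names, the `SIGNAL_RATES` table, the probabilities, and the 0.8 prior are made up, and this is not a description of how Yelp (or anyone) actually scores reviews.

```python
import math
from typing import Dict, Optional, Tuple

# Hypothetical table: for each signal, (P(signal is true | genuine review),
# P(signal is true | fake review)). All numbers are illustrative placeholders.
SIGNAL_RATES: Dict[str, Tuple[float, float]] = {
    "account_older_than_1y": (0.70, 0.20),
    "mobile_checkin_at_business": (0.40, 0.02),
    "verified_purchase": (0.30, 0.01),
    "all_reviews_posted_same_day": (0.05, 0.50),
    "rates_all_competitors_poorly": (0.02, 0.30),
}


def genuine_probability(
    signals: Dict[str, Optional[bool]], prior: float = 0.8
) -> float:
    """Estimate P(review is genuine) from whatever signals are available.

    Signals set to None (or not listed in SIGNAL_RATES) are ignored, so no
    single measure has to be present for a review to get a score.
    Assumes the signals are conditionally independent (the naive-Bayes
    simplification), which real data would almost certainly violate.
    """
    # Start from the prior, expressed as log-odds.
    log_odds = math.log(prior / (1.0 - prior))
    for name, observed in signals.items():
        if observed is None or name not in SIGNAL_RATES:
            continue  # missing evidence leaves the score unchanged
        p_genuine, p_fake = SIGNAL_RATES[name]
        if observed:
            log_odds += math.log(p_genuine / p_fake)
        else:
            log_odds += math.log((1.0 - p_genuine) / (1.0 - p_fake))
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    review_signals = {
        "account_older_than_1y": True,
        "mobile_checkin_at_business": True,
        "verified_purchase": None,  # unknown for this reviewer
        "all_reviews_posted_same_day": False,
    }
    print(round(genuine_probability(review_signals), 3))
```

The resulting probability could then be used as the weight on a review's star rating instead of a hard keep/filter decision, which fits the point that no one signal should be enough on its own.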