by sethorion 6165 days ago

Question:

Greg Linden, who worked on Amazon.com's recommendation engine, has referred to what he calls the "Harry Potter problem". To quote from his blog:

'...this calculation would seem to suffer from what we used to call the "Harry Potter problem", so-called because everyone who buys any book, even books like Applied Cryptography, probably also has bought Harry Potter. Not compensating for that issue almost certainly would reduce the effectiveness of the recommendations, especially since the recommendations from the two clustering methods likely also would have a tendency toward popular items.'

How did you compensate for this problem? Do you simply ignore vertices in the graph that have a large degree?

Or are you using a non-linear weighting function, such as a perceptron's sigmoid function?

With regard to Wikipedia, almost everyone who has edited an article has also edited the article on Bill Clinton. So if you are using edit-history metadata to compute recommendations, you will have to compensate for the analogous "Bill Clinton problem".
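
To make the question concrete, here is a minimal sketch of one common compensation: dividing the co-occurrence count by the geometric mean of the two items' popularity (cosine-style normalization). This is only an illustration, not a claim about what Amazon or anyone else actually does. The function names `item_counts` and `related` and the toy `baskets` data are made up for the example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def item_counts(baskets):
    """Count how often each item appears and how often each pair co-occurs."""
    counts = Counter()
    pairs = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # dedupe, fix pair order
        counts.update(items)
        pairs.update(combinations(items, 2))
    return counts, pairs


def related(item, baskets, normalize=True):
    """Rank other items by co-occurrence with `item`.

    With normalize=True the raw count is divided by
    sqrt(count(item) * count(other)), which penalizes items
    that co-occur with everything simply because they are popular.
    """
    counts, pairs = item_counts(baskets)
    scores = {}
    for (a, b), together in pairs.items():
        if item not in (a, b):
            continue
        other = b if a == item else a
        if normalize:
            scores[other] = together / sqrt(counts[item] * counts[other])
        else:
            scores[other] = together
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda k: (-scores[k], k))


# Hypothetical purchase histories: everyone owns Harry Potter.
baskets = [
    ["Applied Cryptography", "Harry Potter", "Practical Cryptography"],
    ["Applied Cryptography", "Harry Potter", "Practical Cryptography"],
    ["Applied Cryptography", "Harry Potter"],
    ["Harry Potter", "Cookbook"],
    ["Harry Potter", "Travel Guide"],
    ["Harry Potter", "Novel"],
    ["Harry Potter"],
    ["Harry Potter"],
]

print(related("Applied Cryptography", baskets, normalize=False))
print(related("Applied Cryptography", baskets, normalize=True))
```

On this toy data the raw counts rank Harry Potter first for Applied Cryptography, while the normalized score ranks the other cryptography book first. Dropping high-degree vertices outright would be the cruder version of the same idea; a sigmoid or logarithmic damping of the popularity term would be a softer one.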