by Mathnerd314 3520 days ago
My guess is that it only parses certain syntactic patterns. "Blatant state-shtuppers" comes from this blog post: http://www.transterrestrial.com/?p=63723

> "Let’s put blatant State-shtuppers such as Hillary, Bernie, and Obama at about 7 or an 8."

This matches Hearst Pattern #1 from https://www.microsoft.com/en-us/research/wp-content/uploads/...:

> NP such as {NP,}*{(or, and)} NP
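
To make that concrete, here is a rough sketch of how a "such as" matcher could pull pairs out of the quoted sentence. It is not Probase's actual extractor: the regex, the name `extract_such_as`, and the shortcut of treating single capitalized words as noun phrases are all my assumptions standing in for a real NP chunker.

```python
import re

# Assumption: a real system would use a noun-phrase chunker. Here the
# hypernym is the one or two words before "such as", and each hyponym
# is a single capitalized word.
HEARST_SUCH_AS = re.compile(
    r"(?P<hypernym>[\w-]+(?: [\w-]+)?) such as "
    r"(?P<hyponyms>[A-Z]\w*(?:, [A-Z]\w*)*,? (?:and|or) [A-Z]\w*|[A-Z]\w*)"
)

def extract_such_as(sentence):
    """Return (hyponym, hypernym) pairs for the first 'such as' match."""
    m = HEARST_SUCH_AS.search(sentence)
    if m is None:
        return []
    # Split the list on ", " and on the final "and"/"or" (with or
    # without a serial comma).
    hyponyms = re.split(r",? (?:and|or) |, ", m.group("hyponyms"))
    return [(h, m.group("hypernym")) for h in hyponyms if h]

sentence = ("Let's put blatant State-shtuppers such as Hillary, Bernie, "
            "and Obama at about 7 or an 8.")
for pair in extract_such_as(sentence):
    print(pair)
```

A sentence with no "such as" list, like "Hillary is a candidate", yields nothing from this matcher.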

Hillary usually appears by herself rather than in a list. Apparently Probase doesn't pick up the plentiful "X is a Y" associations, e.g. "Hillary is a liar" from http://thefederalist.com/2015/08/27/poll-voters-overwhelming... or "Hillary is a candidate" from http://www.huffingtonpost.com/jeffrey-sachs/hillary-is-the-c...

Or maybe it does, and those associations are ranked down. Probase does have a truth-detection phase, but it's mostly syntactic, and the top categories all have negated counterexamples ("Hillary is not a candidate", "Hillary is not a democrat", etc.).
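
A toy version of that hypothesis, purely to illustrate how a syntactic check could let negated mentions cancel out positive ones. The regex, the name `tally_is_a`, and the +1/-1 scoring are my own assumptions, not anything taken from Probase.

```python
import re
from collections import Counter

# Assumption: X is a single capitalized word and Y a single lowercase
# word; an optional "not" marks the mention as negative evidence.
IS_A = re.compile(
    r"\b(?P<x>[A-Z]\w*) is (?P<neg>not )?(?:an? |the )?(?P<y>[a-z]\w*)"
)

def tally_is_a(sentences):
    """Net support per (X, Y) pair: +1 per positive mention, -1 per negated one."""
    support = Counter()
    for s in sentences:
        for m in IS_A.finditer(s):
            pair = (m.group("x"), m.group("y"))
            support[pair] += -1 if m.group("neg") else 1
    return support

support = tally_is_a([
    "Hillary is a candidate",
    "Hillary is not a candidate",
    "Hillary is a liar",
])
print(support)
```

Under this scoring, a category with as many negated mentions as positive ones nets out to zero, which is one way the obvious categories could end up ranked down.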