Hacker News
by craigacp 716 days ago
There are a bunch of these things in a word2vec space. Years ago I wrote a post on my group's blog where we trained word2vec on a bunch of wikias so we could find out who the Han Solo of Doctor Who is (which, somewhat inexplicably I think, turned out to be Rory Williams). You need to implement word2vec carefully, and then the similarity search, but there are plenty of vaguely interesting things in there once you do.
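For anyone who hasn't seen the trick, here is a minimal sketch of the similarity-search half, in plain Python. It assumes the usual analogy recipe (take vec(b) - vec(a) + vec(c), then rank everything else by cosine similarity), which may not be exactly what the blog post did. The three-dimensional vectors and the `analogy` helper are invented for illustration; real word2vec vectors have hundreds of dimensions and come out of training, not out of my head.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c, topn=1):
    """Answer "a is to b as c is to ?" by ranking words against b - a + c.

    The three query words are excluded from the candidates, since they
    otherwise tend to dominate the top of the ranking.
    """
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    scored = [
        (cosine(target, vec), word)
        for word, vec in vectors.items()
        if word not in (a, b, c)
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:topn]]


# Hypothetical toy vectors: [star-wars-ness, doctor-who-ness, roguishness].
# These numbers are made up so the example has an answer; they are not
# taken from any trained model.
TOY_VECTORS = {
    "star_wars": [1.0, 0.0, 0.0],
    "doctor_who": [0.0, 1.0, 0.0],
    "han_solo": [1.0, 0.0, 1.0],
    "luke_skywalker": [1.0, 0.0, 0.2],
    "rory_williams": [0.0, 1.0, 0.9],
    "the_doctor": [0.0, 1.0, 0.1],
}

result = analogy(TOY_VECTORS, "star_wars", "han_solo", "doctor_who")
print(result)
```

With these toy numbers the answer is baked in, so it says nothing about what a real model trained on wikia text would return. The point is only the shape of the query.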
1 comment

It's a good point about true and false positives though, which makes me wonder if anyone's taken a large database of expected outputs from such "equations" and used it to calculate validation scores for different models in terms of precision and recall.
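To make that concrete, here is a rough sketch of how such a validation score could be computed, assuming you have a table of expected answers per "equation" and a set of answers each model returned (say, everything above some similarity cutoff). The `precision_recall` helper and both dictionaries are hypothetical; I'm not aware of whether a database like this exists in this exact form.

```python
def precision_recall(predicted, expected):
    """Micro-averaged precision and recall over a set of analogy queries.

    predicted: dict mapping query string -> set of answers the model returned
    expected:  dict mapping query string -> set of acceptable answers
    """
    tp = fp = fn = 0
    for query, gold in expected.items():
        guess = predicted.get(query, set())
        tp += len(guess & gold)   # returned and expected
        fp += len(guess - gold)   # returned but not expected
        fn += len(gold - guess)   # expected but not returned
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up data for illustration only, not real model output.
EXPECTED = {
    "king - man + woman": {"queen"},
    "paris - france + italy": {"rome"},
}
MODEL_OUTPUT = {
    "king - man + woman": {"queen", "princess"},
    "paris - france + italy": {"milan"},
}

precision, recall = precision_recall(MODEL_OUTPUT, EXPECTED)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same function over the outputs of several models would give a comparable pair of numbers for each, which is roughly the comparison I have in mind.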