Hacker News
by zanethomas 932 days ago
Hmmm, odd. I was under the impression they used cosine similarity based on page content. Once upon a time, based on that 'memory', I created a system to bin domain names into categories using cosine similarity. It worked surprisingly well.
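
For anyone curious what that kind of binning looks like, here is a minimal sketch. It is not the original system. It assumes plain bag-of-words term counts for the page text, and the category names and seed texts in `CATEGORIES` are made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a sparse bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical category seed texts; a real system would need far more data.
CATEGORIES = {
    "cooking": "recipe ingredients oven bake flour dinner kitchen",
    "finance": "stocks bonds interest loan bank investment market",
    "sports": "team score league match player season coach",
}


def bin_page(page_text, categories=CATEGORIES):
    """Return (best_category, score) for the text fetched from a domain."""
    page_vec = vectorize(page_text)
    scores = {
        name: cosine_similarity(page_vec, vectorize(seed))
        for name, seed in categories.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Calling `bin_page(text)` on the text of a domain's front page returns the closest category and its score. If nothing overlaps, every score is 0.0, so a real version would want a minimum-score cutoff, and probably TF-IDF weighting so that common words do not dominate.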

Regardless, well done!

1 comment

Hmm, seems like something that might be used for deduplication maybe?