by doorhammer 4454 days ago
Thanks for the information. So it looks like it's a specific application of a similarity measure to vectors of bigrams? The most relevant part of the Wikipedia page (I think): http://en.wikipedia.org/wiki/N-gram#n-grams_for_approximate_...

It also appears that the algorithm I linked is actually the Sørensen–Dice index. They have the exact same formula on the wiki page: http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coef...
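
For anyone following along, here is roughly how I understand the Sørensen–Dice index applied to character bigrams: twice the number of shared bigrams divided by the total number of bigrams in both strings. This is only a minimal sketch, not the code from my gist. The function names, the lowercasing, the choice to count repeated bigrams (multiset rather than set), and the handling of strings too short to have any bigrams are all my own assumptions.

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of adjacent character pairs in text (lowercased)."""
    s = text.lower()  # assumption: comparison is case-insensitive
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice_coefficient(a, b):
    """Sorensen-Dice index over character bigrams: 2 * |X & Y| / (|X| + |Y|)."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Assumption: with no bigrams on either side, fall back to exact equality.
        return 1.0 if a == b else 0.0
    # Counter & Counter keeps the minimum count of each shared bigram.
    overlap = sum((x & y).values())
    return 2.0 * overlap / total


if __name__ == "__main__":
    print(dice_coefficient("night", "nacht"))
```

"night" and "nacht" have four bigrams each and share only "ht", so the score works out to 2 * 1 / 8.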

Appreciate the heads up. It gave me much better terms to search for, and I'm going to add them to the notes of my gist. I'm on vacation now, so I'll have to do more reading on it over the next few days.

Also made a public gist for whoever is interested:

https://gist.github.com/doorhammer/9957864