by alixaxel 3719 days ago
Levenshtein is an absolute metric (it counts edits regardless of how long the strings are), so I think a relative measure like Sørensen-Dice would be more useful.
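
To make the difference concrete, here is a rough sketch of both measures. It assumes Sørensen-Dice is computed over character bigrams, which is a common choice but not the only one; the function names are mine.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def bigrams(s: str) -> list:
    """Overlapping character pairs, e.g. 'abc' -> ['ab', 'bc']."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Sørensen-Dice coefficient on bigram multisets, between 0 and 1."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Strings shorter than two characters have no bigrams (assumed convention).
        return 1.0 if a == b else 0.0
    overlap = sum((x & y).values())
    return 2 * overlap / total


# Both pairs are exactly one edit apart, but the relative similarity differs a lot.
for s, t in [("ab", "ac"), ("abcdefghij", "abcdefghik")]:
    print(s, t, levenshtein(s, t), round(dice(s, t), 3))
```

Both pairs have a Levenshtein distance of 1, yet the short pair shares no bigrams at all while the long pair shares 8 of 9.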

Regardless, if you take the short keywords and blacklist any that approximately match curse words from several languages, I think it would be really hard to end up with anything at all.
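
A minimal sketch of that kind of filter, assuming a bigram Sørensen-Dice score and a cutoff of 0.5. The word list, the threshold and the names `BLACKLIST`, `THRESHOLD` and `is_allowed` are all made up for illustration; a real list would cover several languages and be far longer.

```python
from collections import Counter


def dice(a: str, b: str) -> float:
    """Sørensen-Dice coefficient on character bigram multisets."""
    x = Counter(a[i:i + 2] for i in range(len(a) - 1))
    y = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 1.0 if a == b else 0.0
    return 2 * sum((x & y).values()) / total


# Hypothetical placeholders standing in for a multi-language curse word list.
BLACKLIST = ["darn", "heck"]
# Assumed cutoff: reject anything at least this similar to a blacklisted word.
THRESHOLD = 0.5


def is_allowed(keyword: str) -> bool:
    """True if the keyword is not too similar to any blacklisted word."""
    return all(dice(keyword.lower(), bad) < THRESHOLD for bad in BLACKLIST)


candidates = ["darm", "hack", "tree", "hecks"]
allowed = [k for k in candidates if is_allowed(k)]
print(allowed)
```

With these made-up values, "darm" and "hecks" are rejected while "hack" and "tree" get through. Lowering the threshold or growing the blacklist shrinks the allowed set quickly, which is the problem with short keywords.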