Hacker News new | ask | show | jobs
by yantrams 809 days ago
This is my go to method for pretty much every hard problem that I'm forced to solve where I don't have the domain expertise / interest / time. The trick lies in coming up with a clever similarity metric that incorporates penalties etc. You can even go a level deeper and use multiple similarity algorithms and then poll on top of them. Here's a taxonomy extractor for text that I made using similar principles that is surprisingly as good as anything else that I've seen - https://dash.scooptent.com/text