by mack73 3360 days ago
Thanks. The idea behind the BK-tree is ingenious.

I'm struggling to find a use case for that data structure, though. A BK-tree only becomes powerful once it contains millions of words, but holding that much data in memory is a nuisance and makes it not so fast anymore. Why build one when you could represent the same data in a compressed form, with the same querying capabilities and more?

Perhaps BK-trees are for big machines with powerful CPUs? I'm sure there is a setup that would make that tree in fact better than any other tree.

2 comments

I don't think the best use case for BK-trees is spell-checking and words. The area in which they are used most successfully is image deduplication. In that case the metric you're going to use is some form of perceptual hashing.
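To make the deduplication case concrete, here is a minimal sketch of a BK-tree keyed on Hamming distance. It assumes each image has already been reduced to an integer perceptual hash by some other tool; the `BKTree` class, its node layout, and the `hamming` helper are illustrative names, not taken from any particular library.

```python
class BKTree:
    """Minimal BK-tree over any integer-valued metric (illustrative sketch)."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None  # each node is (item, {edge_distance: child_node})

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # identical item is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child

    def query(self, item, radius):
        """Return sorted (distance, item) pairs within `radius` of `item`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            value, children = stack.pop()
            d = self.distance(item, value)
            if d <= radius:
                results.append((d, value))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - radius, d + radius] can contain a match.
            for edge, child in children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(results)


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")
```

Near-duplicates of an image are then the results of `tree.query(image_hash, radius)` for a small radius. The pruning test in `query` is what lets a lookup skip most of the tree when the radius is small.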
I think you are missing an important case such as image data. Another is floating-point vectors, with the distance scaled up and rounded to an integer.
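One way to read "scaled up and rounded" is sketched below. This is an assumption about what is meant, and the function name and `scale` parameter are made up for illustration. A BK-tree needs integer distances that still obey the triangle inequality. Rounding up preserves it, since ceil(x) + ceil(y) >= ceil(x + y), whereas rounding to nearest or down can break it.

```python
import math


def quantized_euclidean(a, b, scale=100):
    """Euclidean distance scaled by `scale` and rounded up to an integer.

    Hypothetical helper: `scale` sets the resolution, so scale=100 keeps
    roughly two decimal places of the original distance.
    """
    return math.ceil(scale * math.dist(a, b))
```

A larger `scale` gives finer resolution but also more distinct edge labels per node, so the tree becomes wider and shallower.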