by StavrosK 4573 days ago
Huh, very nice idea! That should indeed save a ton of space and make searching much simpler! I'll try that now, thank you.

EDIT: Hmm, turns out it's pretty much the same size, which makes sense, I guess: http://nbviewer.ipython.org/gist/skorokithakis/0abbfebced25f...

The space savings for the same error rate should be small (I think the likelihood of false positives for a given load goes down slightly as the filter gets larger), but the benefit in lookup time should be significant for multiword searches. Thinking about it more, though, if you're doing live search you'll already have computed the results for the first word by the time you're given the second, so maybe it doesn't matter.
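For what it's worth, the parenthetical about false positives can be sanity-checked with the usual textbook estimate, p = (1 - (1 - 1/m)^(k*n))^k, for m bits, n items, and k hash functions. This is only a rough sketch: the estimate itself is an approximation, and the sizes, load (10 bits per item), and k = 7 below are made-up numbers, not anything from the notebook.

```python
def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Textbook estimate of a Bloom filter's false-positive probability."""
    # Probability that a given bit is still 0 after all insertions.
    p_bit_clear = (1.0 - 1.0 / m_bits) ** (k_hashes * n_items)
    # A false positive needs all k probed bits to be set.
    return (1.0 - p_bit_clear) ** k_hashes


# Same load (10 bits per item) and same k, at two different sizes.
# These numbers are illustrative assumptions only.
small = false_positive_rate(m_bits=1_000, n_items=100, k_hashes=7)
large = false_positive_rate(m_bits=100_000, n_items=10_000, k_hashes=7)
print(f"small filter: {small:.6f}  large filter: {large:.6f}")
```

Under this estimate the larger filter comes out marginally better at the same load, which fits with the space savings being small.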
I think it'll be faster to use multiple filters: the one-filter way requires hashing and comparing N times, while the multiple-filter way requires hashing once and comparing N times.
Oops, you are correct.
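To make the hashing argument concrete, here is a minimal sketch that counts digest computations for both layouts. It assumes N is the number of documents, that the single filter stores keys of the form "doc_id:word", and that all per-document filters share the same size and hash functions (otherwise one hash would not be reusable). The `Bloom` class, `positions` helper, sizes, and sample documents are all hypothetical, not the code from the notebook.

```python
import hashlib

STATS = {"hashes": 0}  # counts digest computations, for illustration only
M_BITS, K = 1024, 4    # assumed size and hash count for every per-document filter


def positions(key: str, m_bits: int, k: int) -> list[int]:
    """Derive k bit positions from one SHA-256 digest (double hashing)."""
    STATS["hashes"] += 1
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m_bits for i in range(k)]


class Bloom:
    def __init__(self, m_bits: int = M_BITS, k: int = K):
        self.m_bits, self.k, self.bits = m_bits, k, 0

    def add_positions(self, pos: list[int]) -> None:
        for p in pos:
            self.bits |= 1 << p

    def has_positions(self, pos: list[int]) -> bool:
        return all((self.bits >> p) & 1 for p in pos)


docs = {"a": ["bloom", "filter"], "b": ["static", "site"], "c": ["bloom", "search"]}

# Layout 1: one filter per document.
per_doc = {}
for doc_id, words in docs.items():
    f = Bloom()
    for w in words:
        f.add_positions(positions(w, f.m_bits, f.k))
    per_doc[doc_id] = f

# Layout 2: one shared filter keyed on "doc_id:word".
shared = Bloom(m_bits=4096)
for doc_id, words in docs.items():
    for w in words:
        shared.add_positions(positions(f"{doc_id}:{w}", shared.m_bits, shared.k))


def search_per_doc(word: str) -> list[str]:
    pos = positions(word, M_BITS, K)  # hash once, reuse for every filter
    return [d for d, f in per_doc.items() if f.has_positions(pos)]


def search_shared(word: str) -> list[str]:
    # The key changes per document, so each document needs its own hash.
    return [d for d in docs
            if shared.has_positions(positions(f"{d}:{word}", shared.m_bits, shared.k))]
```

Both searches still do N membership checks; the difference is only in how many times the query gets hashed, which is the point made above.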