Hacker News new | ask | show | jobs
by tomc1985 3267 days ago
Unrelated, but I've always had this question about bloom filters...

If testing for membership in the set is unreliable, but testing for non-membership in the set works every time, then why can't you simply invert the boolean for the non-membership test and call that a membership test?

E.g., if it's not not in the group, then it's in the group...

1 comments

If you want to invert the boolean check, then you need to invert the input space as well, which is not possible.

For example, consider a Bloom filter checking the availability of a username during sign up. If you want an inverse Bloom filter that checks if it is not not in the group, then you need to load it with all possible usernames.

Makes sense. OP's link is interactive enough that I was beginning to see that, though couldn't articulate it. Thanks!
If I was going to explain a bloom filter like you're 5... a bloom filter is like a savant who never forgets a face -- maybe he's got a job in passport control in Arstotzka -- if you show him someone's face or picture once, he'll never forget it: if any time later you show him the same picture and ask him "have you seen this face before?" he'll say "yes" without fail. If he replies "no way", you can be 100% sure he's never seen it. But his memory isn't photographic: he confuses people's faces after a while, recognizing faces he's never seen, and that becomes worse the more people he's seen. So to be safe he doesn't reply just "yes" but "hmm, I guess so".

In computing terms the interface has 2 methods:

- takeALookAtThisFace(x)

- yaSeenThisFaceBefore?(x) which returns either "no way" or "hmm, I guess so".

"no way" means P(x is a member) = 0. Its negation is not "hell yes" (P(x is a member) = 1), it's "maybe" (P(x is a member) > 0).

Thanks for this explanation it was really helpful!
Glad to help. But I noticed the analogy is a bit flawed; testing for membership does not add anything to the set, but the analogy might imply asking yaSeenThisFaceBefore(x) will make the savant remember x. I should change the story and the 2 functions to "remember this terrorist" and "is this a terrorist?" or something like that.