Hacker News new | ask | show | jobs
by ScottBurson 3172 days ago
It's correct; you've misread it — in fact, you're agreeing with it. What it's saying is, if the item is in the set, the Bloom filter is guaranteed to report that it is in the set. What you're saying is, if the filter says the item is not in the set, then it is guaranteed not to be in the set. Those two statements are equivalent (being contrapositives: "A implies B" is equivalent to "not B implies not A").
2 comments

The article states a Bloom filter is "guaranteed to report correctly if it contains the item", however a Bloom filter cannot do this. The bits set by the various hash functions could very well be set due to some other key. What the Bloom filter _can_ say, is that if none of the bits are set, then clearly the key was never inserted.
> The article states a Bloom filter is "guaranteed to report correctly if it contains the item", however a Bloom filter cannot do this.

I think what's going on here is that you're reading "if" as meaning "whether". In fairness, this is common English usage: "I'll tell you tomorrow if I'm going" usually means "I'll tell you tomorrow whether I'm going".

That's not what the OP means. The correct reading is perhaps clarified with a comma:

> A Bloom filter is guaranteed to report correctly, if it contains the item.

or perhaps better

> If it contains the item, a Bloom filter is guaranteed to report correctly.

Ah yes, that would be it.

IMO it is better to present the Bloom filter by asking/stating what one can be certain of given each return value of the Bloom filter.

I think it might just be a strangely formed sentence where the part "if it contains the item" is not the subject of the report, but a condition:

If it contains the item -> reports that it contains the item (correctly)

If it doesn't contain the item -> may report that it contains the item (incorrectly)

Nothing wrong with being pedantic when dealing with definitions.
Since we're on the topic of being pedantic, "well, you know, that's just, like... your opinion, man". I'd argue that there's nothing excessive about arguing if A may, in fact, be the opposite of A :)
Certainly not, I'm all for some arguing on the internet.
The logic you wrote is exactly the _opposite_ of what a Bloom filter does.
"If the item is in the set, the Bloom filter is guaranteed to report that it is in the set." This statement while correct is not useful since the query is against the Bloom filter, not the set. The Bloom filter reporting the object's hash key is in the filter does NOT mean the object is in the set. Multiple objects can have the same hashkey. A check of the hashkey on the Bloom filter just means ONE of these objects is in the set. The existence of a hashkey in the Bloom filter does not give much useful information.