Hacker News
by bugzz 1713 days ago
The main use case I've always seen is keeping a Bloom filter on the client side to cut down on lookup traffic to the server. As you said, you could cache 250 million integers in a gigabyte, but you don't want to bloat your client by a gigabyte for every filter you use. Also, the items are often much larger than an integer: a list of malicious URLs, for example, which is a common use case.
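To put rough numbers on that trade-off, here is a back-of-the-envelope sketch using the standard Bloom filter sizing formulas. The 1% false-positive target and the 32-bit integer width are my assumptions, not something stated above, and `bloom_size` is just an illustrative helper name.

```python
import math


def bloom_size(n_items: int, fp_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) for a standard Bloom filter.

    Uses the textbook formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    """
    bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


n = 250_000_000                 # item count from the comment above
raw_bytes = n * 4               # assumption: 32-bit integers, i.e. about 1 GB
bits, k = bloom_size(n, 0.01)   # assumption: 1% false positives is acceptable
filter_bytes = bits // 8

print(f"raw integers: {raw_bytes / 1e9:.2f} GB")
print(f"bloom filter: {filter_bytes / 1e9:.2f} GB using {k} hash functions")
```

Under those assumptions the filter comes out at roughly 0.3 GB against 1 GB for the raw integers. The filter size depends only on the item count and the error rate, not on the item size, so the gap gets much wider when the items are URLs rather than integers.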

I'm so confused by this use case (the traffic-saving one, not the malicious URL classifier). Why not store the "is-paying-customer" bit in a cookie?

What are we using as the user identifier? Where does it come from, if not a cookie?

Also, this client-side Bloom filter kind of leaks your user database (modulo false positives), supposing it's keyed on email addresses and your adversary has a gigantic list of email addresses, or is patient enough to enumerate them.

> Why not store the "is-paying-customer" bit in a cookie?

You shouldn't trust the client. You probably don't want people to get access to paid features with a relatively easy tweak of cookies.

The client-side filter is better suited to something like a list of malware URLs, where the answer is almost always "no" (go ahead). Only a "maybe" needs a bit more work, such as a network request, to check whether the URL is actually blocked.
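A minimal sketch of that flow, in case it helps. Everything here is hypothetical: the `BloomFilter` class, the `is_blocked` helper, the example URLs, and the `ask_server` callable standing in for a real network request. The bit count and hash count are arbitrary small values picked for the demo.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd (nonzero)
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not present"; True only means "maybe".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def is_blocked(url: str, local_filter: BloomFilter, ask_server) -> bool:
    """Check the local filter first and contact the server only on a "maybe"."""
    if not local_filter.might_contain(url):
        return False            # definite "no": go ahead, no network traffic
    return ask_server(url)      # "maybe": the server has the authoritative list


# Demo with a fake server that counts how often it is contacted.
blocklist = {"http://malware.example/bad"}
server_calls = []


def fake_server(url: str) -> bool:
    server_calls.append(url)
    return url in blocklist


client_filter = BloomFilter(num_bits=1024, num_hashes=7)
for bad_url in blocklist:
    client_filter.add(bad_url)

blocked = is_blocked("http://malware.example/bad", client_filter, fake_server)
allowed = not is_blocked("https://news.ycombinator.com/", client_filter, fake_server)
```

A listed URL always reaches the server, because a Bloom filter has no false negatives. An unlisted URL reaches it only on a false positive, which is where the traffic saving comes from. The server stays the source of truth either way, so the client never has to be trusted with the final answer.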