Hacker News new | ask | show | jobs
by devj 2266 days ago
Sorry.. Forgot to mention that.

Yes insertion/deletion is required and the frequency may be higher. The exact usecase is that group membership is dependent on some conditions. We schedule this check every 15 minutes and based on it, we are adding/deleting the members.

3 comments

Xor filters require all the members of the set be provided up front.

Bloom filters allow adding members, but not removing them.

Cuckoo filters allow removing members: https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf

And just for completeness, storing everything losslessly in a btree, radix tree, or hash table would probably be under 300 kB. (80 kB just for the IDs plus, say, an additional ~2x overhead.)

Safe removal is only possible if you know it is already in filter. The 'you can delete from cuckoo filters' bit I feel is often oversold.
Oh wow, I only read the paper's abstract and was fooled. Thanks for the correction.
You can use a counting Bloom Filter that allows deleting. Depending on your update requirements you might also be ok with two bloom filters, one for that has the group memberships and one that has group removals.
10k members every 15 minutes? Just rebuild it from scratch.
Or use a Cuckoo fiter.