|
|
|
|
|
by cmurphycode
3188 days ago
|
|
Having many filters is worse, but not hugely so. I believe the problem is reducible to the fact that you'll have many smaller filters thus the random distribution hurts you a bit more. The key word to google is "blocked bloom filter" e.g. as proposed in http://algo2.iti.kit.edu/documents/cacheefficientbloomfilter... Here's a nice paper with some improvements http://tfk.mit.edu/pdf/bloom.pdf We use blocked bloom filters for a couple of reasons, but one major benefit is the memory locality (our "bloom filter" is 32GB or larger, so it's handy & fast to be able to address it with separate "pages" which are really just individual bloom filters.) |
|