|
|
|
|
|
by snissn
3488 days ago
|
|
I've been running similar algorithms in production for around 4 years. One issue we've never solved wonderfully are unit tests on our bloom filter algorithms, we've used end to end tests to our satisfaction but not really tests of the bloom filter algorithms itself. Do you have some test / benchmark code that you could recommend for testing different bloom filter false positive rates? |
|
Or you could just use Guava's Bloom ;)
As for probabilistic testing of fp rate... The problem is that every once in a while a test will fail.
Disclaimer: I wrote a cuckoo filter library.
If you want to test fp rate check my cuckoo filter test at sanityOverFillFilter() in github.com/MGunlogson/CuckooFilter4J/blob/master/src/test/java/com/github/mgunlogson/cuckoofilter4j/TestCuckooFilter.java
The unit test basically does fuzzing, filling the filter repeatedly in the hopes that one day any errors will surface. The error bounds are pretty large but small enough to detect any egregious failures. Importantly, my filter defaults to a random seed. Guava DOES NOT, so any tests using the same items will be deterministic. The guava filters use this property to verify some filters that have been manually determined to be correct