|
|
|
|
|
by hinkley
229 days ago
|
|
I wonder how often in the wild people are tuning for a 1% false positive rate versus a much lower one, like .1%. You do quickly reach data set sizes where even 1% introduces some strain on resources or responsiveness. Cuckoo claims 70% of the size of bloom for the same error rate, and the space is logarithmic to the error rate. Looks like about 6.6 bits per record versus 9.56 bits for bloom at 1%. But at .5% error rate a cuckoo is 7.6 bpr. In fact you can get to about a .13% error rate for a cuckoo only a hair larger than the equivalent bloom filter (n^9.567 = 758.5) |
|
My fairly niche use case for these kinds of data structures was hardware firewalls running mostly on SRAM, which needed a sub one-in-a-billion false positive rate.