by armon
4417 days ago
My previous job was at an advertising firm, and we used HyperLogLogs for almost all of our real-time analytics infrastructure. They are incredibly space- and time-efficient: each "counter" fits into about a single page of memory and can count into the trillions with <2% error. We developed an extremely high-performance server around it (hlld): https://github.com/armon/hlld. We were typically hitting it with tens of thousands of requests per second across about 50K counters, although it was benchmarked at >1 million ops a second. Similarly, we also make bloomd, the equivalent for bloom filters, which provide a more set-like abstraction: https://github.com/armon/bloomd
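To make the "single page of memory" and "<2% error" figures concrete, here is a minimal HyperLogLog sketch in Python. It is an illustration of the general technique, not hlld's actual implementation: the class name, the SHA-1 hash, the precision of 12 bits, and the one-byte-per-register layout are all assumptions chosen for readability. With 2^12 = 4096 registers the theoretical standard error is roughly 1.04 / sqrt(4096), or about 1.6%, and 4096 one-byte registers happen to fill a 4 KiB page.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter (hypothetical, for illustration only)."""

    def __init__(self, precision: int = 12):
        # The alpha formula below is only valid for m >= 128 (precision >= 7).
        if precision < 7 or precision > 16:
            raise ValueError("precision must be between 7 and 16")
        self.p = precision
        self.m = 1 << precision            # number of registers
        self.registers = bytearray(self.m)  # one byte per register
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take 64 bits of a hash; SHA-1 is an assumption, any well-mixed
        # 64-bit hash would do.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                    # top p bits pick the register
        rest = h & ((1 << width) - 1)       # remaining bits
        # Rank = position of the leftmost 1 bit in `rest` (1-based);
        # an all-zero `rest` gets rank width + 1.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> float:
        # Raw estimate from the harmonic mean of 2^register values.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while
        # many registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(100_000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")  # duplicates do not change the estimate
    print(f"estimated distinct items: {hll.count():.0f}")
```

Adding an item touches a single register and the memory footprint is fixed regardless of how many items are counted, which is what makes the structure attractive for a high-throughput counter server.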