Hacker News
by jdp 4670 days ago
Another cool project is bam[1], a constant key/value server built on similar principles. A single input file (in this case, a TSV file instead of the SPL log file) and an index file. The cool thing about bam is that it uses the CMPH[2] library to generate a minimal perfect hash function over the keys in the input file before putting them in the index file.

[1]: https://github.com/StefanKarpinski/bam
[2]: http://cmph.sourceforge.net/
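
To make the idea concrete, here is a toy sketch of the same principle, not bam's actual code. It assumes unique keys and a tiny TSV held in memory, and it swaps in a brute-force seed search where bam uses CMPH, so it only works for a handful of keys. The names `slot`, `find_seed`, `build_index` and `lookup` are made up for illustration.

```python
import hashlib


def slot(key: bytes, seed: int, n: int) -> int:
    """Hash key into one of n slots, salted by seed."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % n


def find_seed(keys, max_tries=100_000):
    """Brute-force a seed that maps every key to a distinct slot.

    Stand-in for CMPH: feasible only for very small key sets.
    """
    n = len(keys)
    for seed in range(max_tries):
        if len({slot(k, seed, n) for k in keys}) == n:
            return seed
    raise RuntimeError("no collision-free seed found (duplicate keys?)")


def build_index(data: bytes):
    """Return (seed, offsets); offsets[slot(key)] is the byte offset of key's line."""
    entries = []
    pos = 0
    for line in data.splitlines(keepends=True):
        key = line.split(b"\t", 1)[0]
        entries.append((key, pos))
        pos += len(line)
    keys = [k for k, _ in entries]
    seed = find_seed(keys)
    offsets = [0] * len(keys)
    for key, off in entries:
        offsets[slot(key, seed, len(keys))] = off
    return seed, offsets


def lookup(data: bytes, seed: int, offsets, key: bytes):
    """Return the value stored for key, or None if key is not in the file."""
    if not offsets:
        return None
    off = offsets[slot(key, seed, len(offsets))]
    end = data.find(b"\n", off)
    line = data[off:end if end != -1 else len(data)]
    stored_key, _, value = line.partition(b"\t")
    # A perfect hash sends unknown keys to some slot too, so compare the key.
    return value if stored_key == key else None


data = b"apple\t1\nbanana\t2\ncherry\t3\n"
seed, offsets = build_index(data)
print(lookup(data, seed, offsets, b"banana"))
print(lookup(data, seed, offsets, b"durian"))
```

The index is just one offset per key, with no keys stored in it. That is why the lookup has to re-read the key from the data file to reject keys that were never in the input.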


Wow, didn't expect this to make a mention on the front page of HN today. I never did convince Etsy to let me deploy bam in production, but it's so simple that it should be doable without much fuss. I mainly built it as a proof-of-concept to show that serving static data does not have to be difficult – and that loading large static data sets into a relational database is a truly wasteful, terrible approach. Are you actually using bam "in anger"?
Bam looks really interesting, definitely a lot simpler than Sparkey, and the basic principle is the same. I have been hesitant to use perfect hashing for Sparkey since I wasn't sure how well it holds up for really large data sets (close to a billion keys). Impressive to write it in less than 300 lines of clean code!