by tlack 5141 days ago
It's useful if you have data that is easy to cache (e.g., rebuilt every 6 hours) but very commonly accessed. Because the lookups are so quick (two seeks), reads run at almost raw disk speed. But yeah, rebuilding the files is an offline process (build a new file and swap it in using a rename), so your data has to be cache-friendly.
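
To make the "build new file and swap it in using a rename" step concrete, here is a minimal sketch in Python. The function name `publish_snapshot` and the tab-separated file layout are hypothetical stand-ins for illustration only; this is not the cdb file format or its build tooling. It assumes the temporary file sits on the same filesystem as the destination, which is what lets the rename be atomic on POSIX systems.

```python
import os
import tempfile


def publish_snapshot(records, final_path):
    """Write records to a temp file, then rename it over final_path.

    The on-disk layout here is a placeholder (key TAB value NEWLINE),
    not the real cdb format. Only the swap pattern is the point.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file next to the target so both are on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".snapshot-")
    try:
        with os.fdopen(fd, "wb") as out:
            for key, value in records.items():
                out.write(key + b"\t" + value + b"\n")
            out.flush()
            os.fsync(out.fileno())  # push the data to disk before the swap
        # Readers that open final_path see either the old or the new file.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the half-written temp file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers that already have the old file open keep reading the old data until they reopen the path, which is why this pattern suits data that is rebuilt in bulk rather than updated in place.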

It's a good alternative to memcached if your data is larger than what memcached can support in RAM.

In the early 2000s I used it to implement most of the frontend for a PPC marketplace for search engines. Held up well. These days I'd just use memcached or redis.

1 comment

"It's a good alternative to memcached if your data is larger than what memcached can support in RAM."

Unless you are running memcached on an EC2 large instance (8 GB) or bigger:

"No random limits: cdb can handle any database up to 4 gigabytes."

It's pretty easy to throw together a variant using longs for position instead of unsigned ints; the rest of the code stays the same. There's slightly more overhead in the file, but as long as the items you're storing are bigger than a few bytes it's not a huge deal.
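
A rough illustration of that trade-off, using Python's `struct` module. The 32-bit (hash, position) slot matches how classic cdb stores positions as 4-byte little-endian unsigned ints; the 64-bit layout named `SLOT_64` is a hypothetical choice for this sketch, and a real variant might lay things out differently.

```python
import struct

# Classic layout: 32-bit hash and 32-bit file position, little-endian.
SLOT_32 = struct.Struct("<II")
# Hypothetical variant: same 32-bit hash, but a 64-bit file position.
SLOT_64 = struct.Struct("<IQ")


def can_address(slot_format, position):
    """Return True if the position fits in the given slot format."""
    try:
        slot_format.pack(0, position)
        return True
    except struct.error:
        return False


five_gib = 5 * 2**30
print("32-bit slot bytes:", SLOT_32.size)
print("64-bit slot bytes:", SLOT_64.size)
print("32-bit can address 5 GiB:", can_address(SLOT_32, five_gib))
print("64-bit can address 5 GiB:", can_address(SLOT_64, five_gib))
```

Each slot grows by 4 bytes in this sketch, which only matters when the stored items are themselves just a few bytes long.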

Anyway, it's useful for stuff where you want to ship out a big dictionary once a day or so and you need fast lookups, but it doesn't have to be updated transactionally.