|
|
|
|
|
by SomewhatLikely
621 days ago
|
|
My first thought upon seeing the prompt: If you would build an in-memory cache, how would you do it?
It should have good performance and be able to hold many entries.
Reads are more common than writes. I know how I would do it already,
but I’m curious about your approach.
Was to add this requirement since it comes up so often: Let's assume that keys accessed follow a power law, so some keys get
accessed very frequently and we would like them to have the fastest
retrieval of all.
I'm not sure if there are any efficient tweaks to hash tables or b-trees that might help with this additional requirement. Obviously we could make a hash table take way more space than needed to reduce collisions, but with a decent load factor is the answer to just swap frequently accessed keys to the beginning of their probe chain? How do we know it's frequently accessed? Count-Min sketch?Even with that tweak, the hottest keys will still be scattered around memory. Wouldn't it be best if their entries could fit into fewer pages? So, maybe a much smaller "hot" table containing say the 1,000 most accessed keys. We still want a high load factor to maximize the use of cache pages so perhaps perfect hashing? |
|
Why?
* it works
* it's available now
* it scales
* it's capable of HA
* it has bindings for every language you probably want to use
Why bother writing your own cache, unless it's for an exercise? Cache management is complicated and error prone. Unless the roundtrip kills you just use redis (or memcached).