Hacker News new | ask | show | jobs
by NovaX 2684 days ago
That certainly helps, though you assume strict LRU vs approximate ARC. One can of course use similar schemes on LRU (like memcached does) or sampling LRU, making it approximate but reducing the lock contention. The technique that I've used is to buffer and replay events against the policy, so that reads are just a CAS into a ring buffer and replaying is under performed under a non-blocking lock. That works well with any policy and extends to other features, like expiration.
1 comments

I think we'd definitely have things to discuss :-) It sounds like we agree there is rarely a good reason to maintain strict LRU property no matter what scheme or structure is being used.

BTW for expiration I have simply invalidated on next access. This seems to have worked well for me with DNS caching, for example, where basically you have three hot records (gmail.com, hotmail.com, yahoo.com) that expire every 5 seconds. In this case of course there is value in proactively refreshing cache entries with some probability proportional to the expiry time so you don't have thundering herds.

Yes, that would be fun. :)

For fixed expiration (e.g. 5 min TTL), I use time-bounded LRU queues. Then if the head item has not expired, its successors have not either. If it has, it is merely a poll.

For custom expiration (varies per-entry), I use a hierarchical timer wheel. This replaces sorting with hashing, with an amortized O(1) cost.

For refresh, that's when I do it on access since its being used. I keep a refresh timestamp and an expiration timestamp, so that a refresh can occur while the data is valid but aging, but a blocking load is required when its too stale (expired).

You might enjoy this slide deck of the ideas that I use: https://docs.google.com/presentation/d/1NlDxyXsUG1qlVHMl4vsU...