by mohaps 2913 days ago
To add to Vlad's comment above: The major win was being able to create a simple API that inherently computes the update -> cache invalidation probability, taking into account the semantics of the item being cached and its relevance to the query. Doing this explicitly via heuristics requires bespoke effort for each new or modified query.

With Spiral, we were able to approach this top down as a classification problem.

e.g.

If you have a cached query for "Friends that liked my post", the Spiral classifier quickly learns, via feedback from the caching code, that updates to "Post Last Viewed At" or "Post Last Modified At" are not relevant to that query.

Pre-Spiral, this was expressed via a curated blacklist/whitelist, which had to be recreated whenever the query characteristics changed.
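
To make the feedback loop concrete, here is a rough Python sketch. This is not Spiral's actual API: the class, method names, field names, and the simple counting estimator are all hypothetical stand-ins for whatever classifier is really used. It only illustrates the idea of the caching code reporting, per (query, updated field), whether the cached result turned out to be stale, and the estimate replacing a hand-curated list.

```python
from collections import defaultdict


class InvalidationClassifier:
    """Toy online estimate of P(cached result is stale | field updated).

    Hypothetical illustration only; not the Spiral API.
    """

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        # (query, field) -> [stale_count, total_count]. Seeded with a
        # pessimistic prior (1 stale out of 1) so that unseen fields
        # invalidate the cache until feedback says otherwise.
        self.counts = defaultdict(lambda: [1, 1])

    def feedback(self, query, field, was_stale):
        """Caching code reports whether an update really changed the result."""
        stale, total = self.counts[(query, field)]
        self.counts[(query, field)] = [stale + int(was_stale), total + 1]

    def probability(self, query, field):
        stale, total = self.counts[(query, field)]
        return stale / total

    def should_invalidate(self, query, field):
        return self.probability(query, field) >= self.threshold


clf = InvalidationClassifier()
query = "friends_that_liked_my_post"  # hypothetical query key

for _ in range(20):
    # Viewing the post never changed the cached friend list...
    clf.feedback(query, "post_last_viewed_at", was_stale=False)
    # ...but a new like always did.
    clf.feedback(query, "post_like_added", was_stale=True)

print(clf.should_invalidate(query, "post_last_viewed_at"))
print(clf.should_invalidate(query, "post_like_added"))
```

The point of the pessimistic prior is that a brand new field or query starts out safe (always invalidate) and only becomes cheaper once feedback shows the update is irrelevant, which is what the blacklist/whitelist used to encode by hand.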

1 comment

I think this makes sense. My mental model would have been to view this as a sharding-style concern, where items of different sizes and topics would effectively go to different caches, with hit rates determining the size of each cache.

That said, I think I see where I was mistaken in thinking that was an odd example. It was literally the example. Not just a random pedagogical one.

To that end, thanks for sharing! Cool stuff.