by markbnj 3938 days ago
I agree with all the main points here: caching adds a significantly complex component to the system. You should only do it if you absolutely must pull data closer to a consumer. Adding caching "to pick up quick wins" is always dumb.

With that in mind, I do think most of the pitfalls listed here can be avoided with well-understood tools and techniques. There's no real need to be running your cache in-process with your GC'd implementation language. Cache refilling can be a complex challenge for large-scale sites, but I expect that a majority of systems can live with slower responses while the cache refills organically from traffic.
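
To make "refills organically from traffic" concrete, here's a rough sketch of the lazy-fill (cache-aside) pattern. Everything in it is made up for illustration: `CacheAside`, the `loader` callback, and the TTL are hypothetical names, and a real setup would more likely sit in front of an external cache than use a dict like this. The point is only that after a flush nothing is pre-warmed; each key pays for one slow load the first time it's requested again.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Hypothetical cache that fills lazily: entries appear only on a miss."""

    def __init__(
        self,
        loader: Callable[[str], str],          # assumed slow call to the backing store
        ttl_seconds: float = 60.0,             # assumed expiry, chosen for illustration
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve from the cache without touching the store.
            self.hits += 1
            return entry[1]
        # Miss or expired: take the slow path once, then remember the result.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def clear(self) -> None:
        # Simulates a cold start or flush; traffic refills the cache afterwards.
        self._entries.clear()
```

After `clear()`, the first `get` for each key is slow and later ones are fast again, which is the "slower responses while the cache refills" tradeoff. Note that this sketch does nothing to stop many concurrent misses on the same key from all hitting the store at once, and that is roughly where refilling gets hard for large sites.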

The points about testing and reproducible behavior are dead on - no equivocation needed there. As always, keeping it as simple as possible should be a priority.

1 comment

  There's no real need to be running your cache in-process with your GC'd implementation language.
Fundamentally there's no need, but in-process caching may still be the right choice. As always, there are tradeoffs. Standing up a separate cache component incurs non-trivial costs. Your service now has a new "unit of management" - a new thing you need to deploy, monitor, and scale. It's a separate thing which might go down unless it's provisioned for sufficient load, and you need to be careful about unwittingly introducing a new bottleneck or failure mode in your system. These are all solvable problems, but solving them comes at a cost.

You can totally argue that engineers should be forced to think about and address these issues up front with more rigor, and in a perfect world I think I'd agree. :)