by vidarh 4264 days ago
> This doesn't describe a lot of use cases.

It describes a huge proportion of the data I've ever worked with, because there is a massive number of use cases where what you are working with does not have to be (and, as systems scale, rarely is) the single source of truth for that data.

Redis beats memcached the moment you need more than a bunch of blobs.

E.g. our in-house capacity monitoring uses Redis for a ~1 hour time series view of our systems. We don't care about data loss - the odds of losing the data within an hour of an outage we need to care about are small enough to justify the risk, and if that ever becomes a concern we can run two instances and split our updates between them. For long-term storage, we migrate rolled-up data to CouchDB at the moment (it doesn't matter what - we could use anything really; it's rarely queried and is basically there to let us get an occasional longer historical view to budget for resources etc).

We don't particularly care about integrity as long as it's "right most of the time" because the data gets constantly corrected, and precision in the averaged longer term numbers is not particularly important either (I want to be able to project when we run out of disk, for example - I don't care if disk usage is at 96% or 98%, as both are way too close for comfort).

Yet memcached is less attractive because it means far more book-keeping in the app. With Redis we encode part of the information in the keys (keys indicate which level of roll-up the data is at, the name of the data item, and the starting timestamp of the period the data is for) in a way that lets us easily retrieve a list of keys to do the roll-up. With Redis' Lua support we could probably do even more of the roll-up process in Redis itself, but I haven't looked at that yet.
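To make the key scheme concrete, here is a rough Python sketch of the idea, not our actual code. The `level:name:start_ts` layout, the `:` separator, the level names `1m`/`1h`, and the helpers `make_key`, `parse_key` and `roll_up` are all made up for illustration, and a plain dict stands in for Redis so the example runs on its own. Against a real Redis you would list the matching keys by pattern instead of looping over a dict.

```python
from collections import defaultdict
from statistics import mean

SEP = ":"  # hypothetical separator; the real key format is not given above


def make_key(level: str, name: str, start_ts: int) -> str:
    """Encode roll-up level, item name and period start into one key."""
    return f"{level}{SEP}{name}{SEP}{start_ts}"


def parse_key(key: str) -> tuple[str, str, int]:
    """Split a key back into (level, name, start_ts)."""
    level, rest = key.split(SEP, 1)       # level is the first field
    name, start_ts = rest.rsplit(SEP, 1)  # timestamp is the last field
    return level, name, int(start_ts)


def roll_up(store: dict, name: str, src: str, dst: str, period: int) -> list[str]:
    """Average all `src`-level samples of `name` into `dst`-level buckets.

    `store` is a dict standing in for Redis (an assumption of this sketch).
    Returns the destination keys that were written, in time order.
    """
    buckets = defaultdict(list)
    for key in list(store):
        level, item, ts = parse_key(key)
        if level == src and item == name:
            # Align each sample to the start of its roll-up period.
            buckets[ts - ts % period].append(float(store[key]))

    written = []
    for bucket_start in sorted(buckets):
        dst_key = make_key(dst, name, bucket_start)
        store[dst_key] = mean(buckets[bucket_start])
        written.append(dst_key)
    return written


store = {
    make_key("1m", "disk.used", 0): 10.0,
    make_key("1m", "disk.used", 60): 20.0,
    make_key("1m", "disk.used", 3600): 30.0,
    make_key("1m", "cpu", 0): 99.0,
}
written = roll_up(store, "disk.used", "1m", "1h", 3600)
```

The point is that the key alone tells you which samples belong to which roll-up, so the app does not need a separate index of what it has stored.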

There are a lot of apps like this, where your data volume makes hitting disk annoying (this replaced a Graphite-based system - Graphite was thoroughly thrashing an expensive disk array and still regularly getting too slow for us; the Redis-based system uses ~10% of a much slower machine) and where the cost of losing some time window's worth of data is low enough not to matter, because it just means a blip in a data stream that is rapidly obsoleting itself.