by bryanh 4421 days ago
Redis really is a fundamental building block for designing distributed systems these days. I was kind of surprised, but all these examples exist independently in the Zapier codebase as well (all backed by Redis).

I've been meaning to open source our timeseries implementation for a while now. It is very similar to the linked article but uses a "{key}:YYYY:MM:DD:hh:mm:ss" pattern on hashes where you pick your stored granularity and TTL for each time unit. For example: store second granularity "{key}:YYYY:MM:DD:hh:mm": {0-59: count} for 8 hours, minute granularity "{key}:YYYY:MM:DD:hh": {0-59: count} for 24 hours, hour granularity "{key}:YYYY:MM:DD": {0-23: count} for 3 days, and the rest forever. Very similar to https://github.com/jimeh/redistat or other implementations.
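
To make the key layout concrete, here is a minimal Python sketch of the scheme described above. It is not the Zapier implementation, which is not public in this thread. The names `GRANULARITIES`, `buckets`, and `record` are hypothetical, the zero-padded key format is an assumption, and "the rest forever" is read here as a single day-level hash per month with no TTL. The `client` argument is assumed to be a redis-py style client exposing `hincrby` and `expire`.

```python
from datetime import datetime

# (name, strftime suffix for the hash key, field within the hash, TTL in seconds)
# TTLs follow the comment: 8 hours, 24 hours, 3 days, then no expiry.
# The day-level row is an assumption about what "the rest forever" means.
GRANULARITIES = [
    ("second", "%Y:%m:%d:%H:%M", lambda t: t.second, 8 * 3600),
    ("minute", "%Y:%m:%d:%H", lambda t: t.minute, 24 * 3600),
    ("hour", "%Y:%m:%d", lambda t: t.hour, 3 * 24 * 3600),
    ("day", "%Y:%m", lambda t: t.day, None),
]


def buckets(key, when):
    """Return one (hash_key, field, ttl) tuple per granularity for a timestamp."""
    return [
        (f"{key}:{when.strftime(fmt)}", field(when), ttl)
        for _name, fmt, field, ttl in GRANULARITIES
    ]


def record(client, key, when, amount=1):
    """Increment every granularity's counter and refresh its TTL, if it has one."""
    for hash_key, field, ttl in buckets(key, when):
        client.hincrby(hash_key, field, amount)
        if ttl is not None:
            client.expire(hash_key, ttl)
```

Each event touches one hash per granularity, so a read at any resolution is a single hash lookup, and expiring the whole hash cleans up an entire minute, hour, or day of fine-grained counters at once.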

Fun!


Oooo, simmetrica looks very nice! IIRC when I was writing our implementation there weren't any solid Python versions yet. Glad to see that changing!

More good implementation info here http://blog.apiaxle.com/post/storing-near-realtime-stats-in-....

You just solved some of my problems. Thank you very much!

Btw is there any nodejs module for voting? I've done it myself for one app but it would be nice to see other solutions.

Hey Bryan,

just curious, are you guys not already using statsd/graphite? I think that at least you used to, since you contributed to my small script[0] to automate the installation of graphite... So I'm curious if/why it wasn't good enough, or whether this has different requirements that graphite wasn't suitable for?

[0]https://github.com/gingerlime/graphite-fabric

(Jumping in for Bryan; also @Zapier)

We installed statsd/graphite early on to experiment with visualizing our task and request logs for Zaps. We've since settled into Elasticsearch and Graylog, which are phenomenal for debugging and support but have their growing pains.

The timeseries stuff is used more at the application layer, rather than the pure logging layer. For example, I believe we're using it to track how many tasks an account has done over the last 30 days for pricing/plans.
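
As a rough illustration of that kind of read, here is a Python sketch that sums an account's daily counters over a trailing 30-day window. The storage layout is an assumption, not something confirmed in this thread: daily counts are taken to live in one hash per month, named "{key}:YYYY:MM", with the day of the month as the field. `day_buckets` and `tasks_in_window` are hypothetical names, and `client` is assumed to be a redis-py style client exposing `hget`.

```python
from datetime import date, timedelta


def day_buckets(key, today, days=30):
    """List the (hash_key, field) pairs covering the last `days` days, newest first."""
    pairs = []
    for offset in range(days):
        d = today - timedelta(days=offset)
        pairs.append((f"{key}:{d.strftime('%Y:%m')}", d.day))
    return pairs


def tasks_in_window(client, key, today, days=30):
    """Sum the daily counters in the window, treating missing days as zero."""
    total = 0
    for hash_key, field in day_buckets(key, today, days):
        raw = client.hget(hash_key, field)  # None when no tasks were recorded that day
        total += int(raw) if raw is not None else 0
    return total
```

A plan-limit check could then compare the returned total against the account's allowance. With many accounts, batching the `hget` calls in a pipeline would cut round trips, at the cost of a slightly longer example.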

Yep! Redis tends to be more for reading in-flight rate/plan limiting, something I've not heard a lot about in conjunction with graphite (though it might be great!). We might bring back statsd/graphite for alerting/monitoring in general though; we've been looking for solutions there.

Thanks (also to Mike) for sharing. I can't comment on your specific use-case, but I imagine that unless your latency requirements are very demanding, graphite can play nicely here. Its rather expressive querying allows you to aggregate timeseries data pretty easily. Of course, Graphite is primarily used for monitoring and trending, but it's not limited to this use-case. Alerting isn't something it's particularly great at, though there are other tools that plug in there.

Not trying to diss redis, I'm using redis as well and love it, but was just curious since you mentioned time-series data.