Hacker News
by simonw 5597 days ago
I really like the idea of pushing log messages into a Redis list and then flushing them out to disk with another process.

I've often thought it would be useful to have a redis equivalent of MongoDB's capped collections, specifically to make things like recent activity logs easier to implement. At the moment you can simulate it with an rpush followed by an ltrim, but it would be nice if using two commands wasn't necessary.
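
For illustration, the rpush-then-ltrim simulation mentioned above might look roughly like the sketch below. `InMemoryLists` is a hypothetical stand-in that mimics only the two Redis list commands used, so the example runs without a server, and `append_capped` and the key name `recent` are likewise made up for the example. The assumption is that a real client exposing `rpush(key, value)` and `ltrim(key, start, end)` with Redis semantics (redis-py does) could be passed in instead.

```python
class InMemoryLists:
    """Hypothetical stand-in for a Redis client (RPUSH and LTRIM only)."""

    def __init__(self):
        self.data = {}

    def rpush(self, key, *values):
        # RPUSH appends to the tail and returns the new list length.
        lst = self.data.setdefault(key, [])
        lst.extend(values)
        return len(lst)

    def ltrim(self, key, start, end):
        # LTRIM keeps start..end inclusive; negative indexes count from the tail.
        lst = self.data.get(key, [])
        n = len(lst)
        if start < 0:
            start = max(n + start, 0)
        if end < 0:
            end = n + end
        self.data[key] = lst[start:end + 1] if end >= 0 else []
        return True


def append_capped(client, key, entry, cap):
    """Append an entry, then keep only the newest `cap` entries."""
    client.rpush(key, entry)
    # After an RPUSH the newest entries sit at the tail,
    # so the range to keep is the last `cap` elements.
    client.ltrim(key, -cap, -1)


client = InMemoryLists()
for i in range(5):
    append_capped(client, "recent", f"event {i}", cap=3)
print(client.data["recent"])
```

Since RPUSH adds to the tail, the trim keeps the range `-cap` to `-1`. With LPUSH the newest entries are at the head, and the range to keep would be `0` to `cap - 1`.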

2 comments

Hello Simon,

Sending LPUSH+LTRIM in a pipeline is the same as having a special command for this. But having a special command for this, and for other use cases, makes Redis somewhat less general. What I mean is that if we consider every added feature a cost (a complexity cost, not a development cost), why not instead add a feature that covers a use case that is currently not covered?

By the way, there is an interesting pattern that lets you send the LTRIM only rarely. Imagine you want a list that holds a user's timeline, and you are interested only in the latest 100 messages. You could LPUSH+LTRIM for every entry, but you can just as well LTRIM only 10% of the time. Your list will fluctuate in length between roughly 100 and 110, but since you access it using LRANGE, the additional elements will not create any problem. So the cost of the LTRIM, while already very small, can be made 90% smaller with this simple trick.
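
A minimal sketch of the occasional-trim pattern, under some assumptions: `InMemoryLists` is a hypothetical in-memory stand-in for a Redis client (only LPUSH, LTRIM and LRANGE), and `push_timeline`, `latest` and the key `timeline:42` are names invented for the example. The trim here is decided by a random draw, which is one possible reading of "10% of the time". With a random draw the list length is only bounded on average, not strictly capped at 110.

```python
import random


class InMemoryLists:
    """Hypothetical stand-in for a Redis client (LPUSH, LTRIM, LRANGE only)."""

    def __init__(self):
        self.data = {}

    @staticmethod
    def _window(lst, start, end):
        # Redis ranges are inclusive; negative indexes count from the tail.
        n = len(lst)
        if start < 0:
            start = max(n + start, 0)
        if end < 0:
            end = n + end
        return lst[start:end + 1] if end >= 0 else []

    def lpush(self, key, *values):
        # LPUSH inserts each value at the head and returns the new length.
        lst = self.data.setdefault(key, [])
        for value in values:
            lst.insert(0, value)
        return len(lst)

    def ltrim(self, key, start, end):
        self.data[key] = self._window(self.data.get(key, []), start, end)
        return True

    def lrange(self, key, start, end):
        return self._window(self.data.get(key, []), start, end)


def push_timeline(client, key, message, keep=100, trim_probability=0.1, rng=random):
    """LPUSH every message, but pay for the LTRIM only some of the time."""
    client.lpush(key, message)
    if rng.random() < trim_probability:
        client.ltrim(key, 0, keep - 1)


def latest(client, key, keep=100):
    """Readers always ask for a fixed window, so extra elements are never seen."""
    return client.lrange(key, 0, keep - 1)


client = InMemoryLists()
for i in range(1000):
    push_timeline(client, "timeline:42", i)
print(len(latest(client, "timeline:42")))
```

Because `latest` always reads the fixed range `0` to `keep - 1`, any elements beyond position 99 that are waiting for the next trim stay invisible to readers.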

I use this exact pattern at Boxcar to keep some lists short and it works great for us.
Thank you for reporting a "live" usage of this pattern!
I don't know... that part sounded like a hacked-up half-implementation of Scribe: https://github.com/facebook/scribe/wiki

I'd be interested in hearing if they tried to use Scribe for the same task and found it wanting, or if there's some other story.

Could you say more about why Bump's implementation of network-based queued logging is "hacked-up" while facebook's (by implication) isn't?

To answer your question, simply put, no one here had heard about Scribe.

"Could you say more about why Bump's implementation of network-based queued logging is "hacked-up" while facebook's (by implication) isn't?"

Well, mainly because Scribe was purpose-built to do log aggregation on a large scale, and has nice features to prevent data loss in the event of network and node failure. It's also pretty well-tested at this point, given its origins and community. Check the wiki to which I linked.

I didn't mean my comment to imply anything negative. I was just trying to point out to the parent comment that there's now a better option than rolling a custom log aggregator on top of Redis. That may not have been true when you started your system. Mea culpa.

Scribe is a very powerful logging tool, but it comes with dependency costs: you have to compile Boost, Thrift, fb303, and all the Scribe logging libraries as well. If you are already a Thrift shop, it can make a lot of sense, but otherwise there is a lot of legwork to get it up and running.
Compiling a few libraries is worse than rolling your own log aggregation service? Really?

I'd rather spend a day compiling libraries than spend a week re-writing a piece of basic infrastructure.