by pjscott 5385 days ago

I think your description of redis' reliability is a bit too vague and scary. There are several levels of reliability that redis supports, each with its own tradeoffs:

1. No disk. Everything in memory, and if redis dies, so does your data. This is closest to memcached.

2. In memory, with periodic background flushes to disk. After a timeout (shorter if there's a lot of modification to your data), redis will fork a child process that writes out all its data to a file in the background. (Then it will atomically rename the file into the place of the previous dump file.) This is the easiest form of disk persistence, and good enough for most of what people use redis for.

3. You can also configure redis to write to an append-only file that logs every write operation, which you can periodically compact with a cron job or something. How often the file is fsynced is configurable (on every write, once a second, or left to the OS), and makes a big difference in speed. This is not particularly convenient -- who the hell wants to write a cron job to compact database logs? -- but it gives you durability on par with a conventional database.

4. If you have another machine lying around, there's an option that you can combine with any of the three options above: master-slave replication. A read-only slave redis server receives a stream of all writes to the master, and changes itself to match. This gives a small data-loss window in the event of a master failure, and makes fail-over possible. If the master goes down, you can have a slave become the new master. Coordinating this can be tricky, but it can certainly be done.
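
For reference, the four levels above map roughly onto a handful of redis.conf directives. The sketch below only renders those directives as text. The `LEVELS` table and `render_conf` helper are hypothetical names made up for illustration, and the save thresholds and master address are placeholder values, not recommendations. (`slaveof` is the directive name from this era; newer redis versions also accept `replicaof`.)

```python
# Hypothetical helper: renders redis.conf fragments for the levels described above.
# The directive names are real; the specific values are placeholders.
LEVELS = {
    "memory_only": ['save ""', "appendonly no"],          # 1. no disk at all
    "snapshot": ["save 900 1", "save 60 10000", "appendonly no"],  # 2. periodic dumps
    "aof_everysec": ["appendonly yes", "appendfsync everysec"],    # 3. AOF, fsync once a second
    "aof_always": ["appendonly yes", "appendfsync always"],        # 3. AOF, fsync on every write
}


def render_conf(level, master=None):
    """Return a redis.conf fragment; `master` is an optional (host, port) for option 4."""
    lines = list(LEVELS[level])
    if master is not None:
        host, port = master
        lines.append(f"slaveof {host} {port}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_conf("aof_everysec"))
    print(render_conf("snapshot", master=("10.0.0.1", 6379)))
```

Replication is an extra line on top of whichever persistence level you pick, which is why it is passed separately here.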

tl;dr: If the reliability approaches above look good enough for your application, and redis looks like the best match from a semantic or performance standpoint, then go for it!

redbad 5385 days ago

I had temporarily forgotten about master-slave replication. Thanks for the reminder. And I appreciate your broad points. But I still maintain that you're fooling yourself if you think any of those options is a stand-in for Proper™ database-style persistence. By that I mean some contractual assurance in the software that a change written to a data structure is permanent, to standards comparable to the underlying storage system.

If Redis were a database, I would expect the result of a successful HMSET to be available in perpetuity, even if the machine was rebooted immediately after I got the "OK".

Append-only command logs don't solve the problem; replaying a complex series of state transitions is not a viable substitute for the storage of the end result of those transitions. It's computationally correct, and a straightforward implementation, but quickly and easily becomes unacceptably inefficient. I hope that's clear enough that I don't need to provide an example.

Replication is a solution for the problem of network or machine instability, assuming a valid Redis use-case. It doesn't address persistence in the sense of databases. In distributed computing, High-Availability is orthogonal to Persistence.

Periodic background flushes to disk from a child process come the closest to solving the persistence problem, but to get database-class QoS you'd need to trigger on every change (or, say, every second). Obviously this is a bad idea and not what the feature is designed for, which circles back to my main point: Redis is not designed to be a database. If it's operating as a memcached replacement in your stack, great. If it's standing in for authoritative, long-term storage of critical data, it's being misused.

pjscott 5385 days ago

Let's compare Redis's AOF mode with InnoDB. InnoDB manages to give the guarantee you describe by flushing its log to disk on every transaction commit. If you're willing to sacrifice some durability for write speed, this can be relaxed. In redis, the closest equivalent to this would be running in AOF mode with an fsync on every write (appendfsync always).
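
To make "flush on every write" concrete, here's a minimal sketch of an append-only log that fsyncs before returning. It's a toy, not how redis or InnoDB actually implement it; `AppendLog` and its `fsync_always` knob are hypothetical names, loosely modeled on the always-vs-relaxed tradeoff.

```python
import os
import tempfile


class AppendLog:
    """Toy append-only log. `fsync_always` is a hypothetical durability knob."""

    def __init__(self, path, fsync_always=True):
        self.fsync_always = fsync_always
        self.f = open(path, "ab")

    def append(self, record: bytes):
        self.f.write(record + b"\n")
        if self.fsync_always:
            self.f.flush()             # push Python's buffer to the OS
            os.fsync(self.f.fileno())  # ask the OS to push it to the device
        # Otherwise the record may sit in a buffer and be lost on a crash.

    def close(self):
        self.f.flush()
        os.fsync(self.f.fileno())
        self.f.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.log")
    log = AppendLog(path, fsync_always=True)
    log.append(b"SET user:1 alice")  # returns only after the fsync call
    log.close()
```

With `fsync_always=True` every append pays for a disk sync, which is where the write-speed cost comes from. Turning it off is the "sacrifice some durability" option.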

The difference here is not one of durability, but in how the data is stored on disk. InnoDB keeps the logs small by periodically updating a B-tree with the changes in the logs, after which those changes can safely be removed from the logs. The result of this is strong durability, a reasonably compact on-disk representation, and fairly fast recovery when someone trips over the power cord.

Redis, in AOF mode, appends every write command to the log file and (if you specify it in the config file) fsyncs to disk after every write. The problem is that this file grows without bound: if you leave redis running forever, it will eventually fill up your hard drive, and recovering from a restart will take way too damn long if you have to replay a 1 TB log file. The conventional way of dealing with this is to periodically use the BGREWRITEAOF command, which does essentially the same thing as a background data dump: it writes out a new AOF file from the current contents of redis in memory, and replaces the old AOF file with it. This is roughly equivalent to augmenting the usual periodic-data-dump behavior of redis with periodically-flushed logs, just like a more conventional database.
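
Here's a toy version of that replay-then-rewrite cycle, as a sketch only. The text log format and the `replay` / `rewrite` helpers are made up for illustration. The real thing does the rewrite in a forked child and also has to carry over writes that arrive while the rewrite is running, which this ignores.

```python
import os
import tempfile


def replay(path):
    """Rebuild the in-memory dict by replaying every logged command in order."""
    state = {}
    with open(path) as f:
        for line in f:
            op, key, *rest = line.split()
            if op == "SET":
                state[key] = rest[0]
            elif op == "DEL":
                state.pop(key, None)
    return state


def rewrite(path, state):
    """Write a minimal log for the current state, then atomically swap it in."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for key, value in state.items():
            f.write(f"SET {key} {value}\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename over the old log


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.aof")
    with open(path, "w") as f:
        for i in range(1000):
            f.write(f"SET counter {i}\n")  # 1000 log entries for one key
    state = replay(path)
    rewrite(path, state)
    print(state, os.path.getsize(path))
```

After the rewrite, the log holds one command per live key instead of one per historical write, so replay time tracks the size of the data rather than the age of the server.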

If there's something I'm missing here, I'd love to hear it.