by teraflop 4398 days ago
Well, that's a matter of terminology. It uses quorum replication, so it can make progress as long as a majority of replicas are online and communicating. I would consider that "proper" replication in the sense of a replicated state machine.

You're right that it's different from, say, master/slave replication in an SQL database. There's no distinction between an authoritative master and a slave that provides stale data. Each machine either gives you consistent reads and writes, or is unavailable.

As far as latency goes, the gory details are in the design document. Every write has to be acknowledged by a majority of the N replicas, so you need to hear from roughly N/2 other replicas; there's no way around that without giving up consistency. But that doesn't mean you can only do one transaction every 50ms: transactions can be pipelined, and non-conflicting transactions can proceed simultaneously.
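
To make the quorum arithmetic concrete, here is a minimal sketch. The function names are my own for illustration and are not taken from the design document; it only assumes plain majority quorums where the coordinating replica counts as one vote.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a strict majority of n."""
    return n // 2 + 1


def peers_needed(n: int) -> int:
    """Other replicas a coordinator must hear from, counting itself as one vote."""
    return majority(n) - 1


def can_make_progress(n: int, reachable: int) -> bool:
    """True if `reachable` replicas (coordinator included) form a majority of n."""
    return reachable >= majority(n)
```

With 5 replicas, a majority is 3, so the coordinator waits for 2 peers, and the group keeps working with up to 2 replicas offline.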

1 comment

Okay, so there will be conflicts, which brings us back to the original question.

>I would consider that "proper" replication in the sense of a replicated state machine.

When I think about proper replication, I'm thinking about master-master replication that doesn't fail if the connection between peers is sometimes down, even for very long periods (e.g. what CouchDB can handle). I'm of course not saying that other kinds of replication are somehow inherently bad, but multi-master replication without active connections is what I'm after, and it's something a lot of modern applications can benefit from.

Once you have two databases that are not connected all the time, you need to handle conflicts. You can move conflict handling entirely to the client side, but it must be implemented somewhere. I think that's such a common use case that the database should provide basic interfaces and an implementation for it. If nothing else, that removes a lot of boilerplate code. Of course, no database can resolve conflicts fully on its own, as some of the resolution always depends on the business domain.
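
As a rough sketch of the kind of basic interface I mean: the database detects that two copies diverged, and the application only supplies the domain-specific merge. This uses version vectors as one possible detection technique; every name here (`compare`, `merge`, `resolve`, `replica_id`) is hypothetical and not the API of CouchDB or any other real database.

```python
from typing import Callable, Dict, Tuple, TypeVar

T = TypeVar("T")
VersionVector = Dict[str, int]  # replica id -> number of updates seen from it


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'equal', 'a_newer', 'b_newer' or 'conflict'."""
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"  # each side has updates the other never saw
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(
    local: Tuple[VersionVector, T],
    remote: Tuple[VersionVector, T],
    resolve: Callable[[T, T], T],
    replica_id: str,
) -> Tuple[VersionVector, T]:
    """Generic part lives here; `resolve` carries the business-domain logic."""
    (lv, lval), (rv, rval) = local, remote
    order = compare(lv, rv)
    # The merged vector covers everything either side has seen.
    merged = {k: max(lv.get(k, 0), rv.get(k, 0)) for k in set(lv) | set(rv)}
    if order == "conflict":
        # Resolving is itself a new update made by this replica.
        merged[replica_id] = merged.get(replica_id, 0) + 1
        return merged, resolve(lval, rval)
    if order == "b_newer":
        return merged, rval
    return merged, lval
```

For example, a shopping cart stored as a set could pass `resolve=lambda l, r: l | r` to keep items added on either side, while a different domain would plug in its own rule.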