by lars512 4993 days ago
Inconsistent reads in replica sets are something we've come across with MySQL read slaves as well. I think it's a gotcha of that whole model of replication, rather than a MongoDB-specific issue.

I'm not aware of any database that solves this problem. Is there one? As far as I know, MySQL reads must be distributed to the slaves at the application level, which has no knowledge of master/slave inconsistency. I suppose the time delta between master and slave can be queried, but that still doesn't protect against race conditions or inconsistent reads. This is actually why we chose to use slaves only for data redundancy rather than read throughput at my last company. Inconsistent reads weren't tolerable.
Riak does. You say, when writing, "please don't return until this data is replicated on 2 servers." And when reading, "please only return a successful read if this data is read from 2 servers."

So you have R = 2, W = 2, R+W = 4, and if your replication value (N) is 3, you're fine (you're always going to get consistency if R+W > N).

Riak is cool.
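
To make the R+W > N rule above concrete, here is a minimal Python sketch that brute-forces the counting argument: it checks whether every possible read set shares at least one replica with every possible write set. The function name is made up for illustration and this is not Riak's API. It models only the overlap arithmetic and says nothing about failure conditions or concurrent writes.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Return True if every read set of size r shares at least one
    replica with every write set of size w, out of n replicas.

    Hypothetical helper for illustration only; replicas are just 0..n-1.
    """
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# N=3, R=2, W=2: R+W = 4 > 3, so any read touches a replica that took the write.
print(quorums_always_overlap(3, 2, 2))  # True

# N=3, R=1, W=2: R+W = 3, which is not > 3, so a read can hit the one stale replica.
print(quorums_always_overlap(3, 1, 2))  # False
```

The second call is the case where a write is acknowledged by a majority but the read goes to a single server: nothing forces that server to be one of the two that have the data.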

Riak is cool, and what you have described is correct; however, under failure conditions[1] you may not get this desired behavior.

[1]http://docs.basho.com/riak/latest/references/appendices/conc...

I believe Cassandra does as well, not 100% sure though.
Cassandra does. You can write with a write consistency of W and read with a read consistency of R, and as long as their sum is greater than the replication factor N (the number of copies stored across the cluster), you have consistent reads: W + R > N.

http://wiki.apache.org/cassandra/ArchitectureOverview#line-1...

MongoDB has such a feature (it may depend on the driver, but at least the JVM drivers have it: http://api.mongodb.org/java/2.9.1/com/mongodb/WriteConcern.h...).

For me, it's mostly a question of performance and application architecture: most of the time you don't want to wait until a write is replicated to the slaves.

As far as I can tell, WriteConcerns don't protect from inconsistent reads in all cases. It looks like the most conservative setting is Majority, but even then there is no assurance that reads won't occur during the replication, nor that they won't occur to one of the minority of non-replicated servers.
You can set it to the total number of slaves you have and ensure data is on all of them. Normally that slows writes down enough that it's undesirable, though.
Does Riak support distributed transactions? If not, I don't see how they handle the possibility of a read occurring during the replicated write.
No, Riak is nontransactional. But neither is MongoDB, if that matters here. Riak is apparently getting some kind of strong consistency though. Calvin looks interesting for a NoSQL store with distributed transactions.
This is also something we desperately needed at my last company, but we couldn't find FOSS that supported it. MySQL offers XA, but I've heard mixed reviews.
Shameless plug (hey Tokutek is doing it), in VoltDB replication is synchronous so it doesn't have this problem.

Latency in the current version is nothing to write home about, but in V3 latency with replication is 600-1000 microseconds. Group commit to disk is every 1-2 milliseconds.

V3 also allows reads to be load balanced across replicas and masters so you gain some additional read capacity from replication. V3 also routes transactions directly to the node with the data so you don't use capacity forwarding transactions inside the cluster.

You get to keep transactions too. Now go figure out what you don't get to keep ;-)

>Now go figure out what you don't get to keep ;-)

Cross-datacenter replication becomes a Really Bad Idea?

You don't have to give up cross-DC replication if you do it asynchronously, but you lose cross-shard consistency when there is a dirty failover. This affects distributed transactions and series of single part transactions that depend on each other across different shards.

What Volt supports right now is actually asynchronous replication that does preserve cross shard consistency, but that is not going to last.

You can do synchronous multi-DC replication, but then you have Spanner and the associated latency of multiple data-center quorums.

There is also Calvin http://bit.ly/RGW9RY

Any platform that supports synchronous replication?

http://www.postgresql.org/docs/9.1/static/warm-standby.html#...

This is called semi-synchronous replication

http://dev.mysql.com/doc/refman/5.5/en/replication-semisync....

One way to resolve it is to mark that user or session (or even just request) "sticky to the master" for long enough to cover your normal replication delay.

When we saw it before, ensuring that a given request which issued a write also read from the master was sufficient (sub-second replication delay).
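
A rough Python sketch of the "sticky to the master" idea, assuming the application routes its own reads. The class name, method names, and the one-second window are all made up for illustration and do not come from any particular driver or framework. The clock is injectable only so the example can run without sleeping.

```python
import time


class StickyRouter:
    """Send a session's reads to the master for `stick_seconds` after it writes.

    Hypothetical helper; `stick_seconds` should cover your normal replication delay.
    """

    def __init__(self, stick_seconds: float = 1.0, clock=time.monotonic):
        self.stick_seconds = stick_seconds
        self.clock = clock
        self._last_write = {}  # session id -> time of that session's last write

    def record_write(self, session_id: str) -> None:
        # Call this whenever the session issues a write to the master.
        self._last_write[session_id] = self.clock()

    def target_for_read(self, session_id: str) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.stick_seconds:
            return "master"
        return "replica"


# Fake clock so the example is deterministic.
now = [0.0]
router = StickyRouter(stick_seconds=1.0, clock=lambda: now[0])

router.record_write("alice")
print(router.target_for_read("alice"))  # master: alice just wrote
print(router.target_for_read("bob"))    # replica: bob may not see alice's write yet

now[0] = 2.0
print(router.target_for_read("alice"))  # replica: the sticky window has passed
```

Note that "bob" is still routed to a replica right after alice's write, so this only gives a session a consistent view of its own writes.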

This may help in the majority of the cases, but many applications also can't tolerate inconsistent reads across users/sessions.