|
Sure! CRDTs are often presented as a way of building asynchronous replication system with eventual consistency. This means that when modifying a CRDT object, you just update your local copy, and then synchronize in background with other nodes. However this is not the only way of using CRDTs. At there core, CRDTs are a way to resolve conflicts between different versions of an object without coordination: when all of the different states of the object that exist in the network become known to a node, it applies a local procedure known as a merge that produces a deterministic outcome: all nodes do this, and once they have all done it, they are all in the same state. In that way, nodes do not need to coordinate before-hand when doing modifictations, in the sense where they do not need to run a two-phase commit protocol that ensures that operations are applied one after the other in a specific order that is replicated identically at all nodes. (This is the problem of consensus which is theoretically much harder to solve from a distributed system's theory perspective, as well as from an implementation perspective). In Garage, we have a bit of a special way of using CRDTs. To simplify a bit, each file stored in Garage is a CRDT that is replicated on three known nodes (deterministically decided). While these three nodes could synchronize in the background when an update is made, this would mean two things that we don't want: 1/ when a write is made, it would be written only on one node, so if that node crashes before it had a chance to synchronize with other nodes, data would be lost; 2/ reading from a node wouldn't necessarily ensure that you have the last version of the data, therefore the system is not read-after-write consistent. To fix this, we add a simple synchronization system based on read/write quorums to our CRDT system. More precisely, when updating a CRDT, we wait for the value to be known to at least two of the three responsible nodes before returning OK, which allows us to tolerate one node failure while always ensuring durability of stored data. Further when performing a read, we ask for their current state of the CRDT to at least two of the three nodes: this ensures that at least one of the two will know about the last version that was written (due to the intersection of the quorums being non-empty), making the system read-after-write consistent. These are basically the same principles that are applied in CRDT databases such as Riak. |
So an update goes to node A, but not to B and C. Meanwhile, the connection to the client may be disrupted, so the client doesn't know the fate of the update. If you're unlucky here, a subsequent read will ask B and C for data, but the newest data is actually on A. Right?
I assume there's some kind of async replication between the nodes to ensure that B and C eventually catch up, but you do have an inconsistency there.
You also say there is no async replication, but surely there must be some, since by definition there is a quorum, and updates aren't hitting all of the nodes.
I understand that CRDTs make it easier to order updates, which solves part of consistent replication, but you still need a consistent view of the data, which is something Paxos, Raft, etc. solve, but CRDTs separated across multiple nodes don't automatically give you that, unless I am missing something. You need more than one node in order to figure out what the newest version is, assuming the client needs perfect consistency.