|
(edit: post author, [not OP]) here. Thanks for the feedback! > I guess one of the key insights is that each data has a canonical server owner which enforces the consistency of the writes of the data at a single place. Well, this is the way I presented it, because it's easiest to understand. But, if you want a replicated system that provides HA, there are alternatives (http://www.bailis.org/blog/non-blocking-transactional-atomic...). > When a third client3 tries to read the latest value of x or y, what is the latest value of its peer data? It looks like depending which data client3 starts with, it would get a different version of the peer data? ... Am I missing something or this is the semantic in determining the latest values of peer data? Good question! This ultimately comes down to how you want to handle concurrent writes. Many distributed databases use a "last writer wins" strategy when reconciling concurrent updates (that is, the correct behavior is specified to be that the database serves the highest-timestamped version of a given data item). Now, in your example, the clients both started their writes at the same (real-world) time and used this time as a basis for their timestamp, so the "last" write is undefined. We need a way to break the "last writer wins" tie. In practice, this can be something like the client ID appended to the last few bits of the timestamp or even a hash of the value written--as long as the tie-breaker is deterministic (that is, different replicas don't decide different "winners"), it doesn't really matter which is chosen. In practice, to avoid storing every value ever written, you'd want to provide some kind of "merge" function for multiple writes, and, in our implementation and in often practice, this is last writer wins (plus some deterministic tie-breaker for identically timestamped but distinct writes). |
The only problem that bogs me a bit in this algorithm is that transaction time sequence is not guarantied and preserved. Some clients may see x=y=10 and others x=y=20 during the transaction period. This happen if one client starts writing x=y=10 with s1 and the other y=x=20 with s2. Clients requesting x and y starting with s1 will get x=y=10 and those requesting x and y starting with s2 will get x=y=20. When the transactions completes, the values finally stored in x and y as good may be 10 or 20.
So the algorithm ensures consistency, which is the most important and useful property, but with many concurrent write transactions, the ending DB content may be a bit somehow unpredictable.
This can be a problem when one needs to do operations like x-=1 and y-=1 on the database as for seats reservation in a plane for a travel with two flights for instance. How would this be done ?