|
> in a log database, the ordered log would be eg. 1<-2<-3<-5<-4<-6 How do you know that? It's not inferrable from the DAG. Is sequencing also provided "a layer up"? > if 4 and 6 are both a set operation on key a, it would mean that key a would have the value from 6 and the next write to key a would overwrite the value in a. Yes, I mean for all of my events to be reads and writes of the same key. And you've proven my point, I think: if the resolution of this causal tree is Last-Writer-Wins, 6-domiantes-4, and "key a [gets] the value from 6", then whichever poor user was operating on the 3-4 causal branch has lost their writes. This is a problem! If you claim to be a CRDT and offline-first or whatever, then as a user, I expect that the operations I make while I'm disconnected aren't just going to be destroyed when I reconnect, because someone else happened to be using a computer with a lexicographically superior hostname (or however you derive your vector clocks). And if you want to say something like, well, when the unlucky user reconnects and sees that their work has been overwritten, they can just look in the causal history, extract their edits, and re-apply them to the new root -- there's no reason for any of this complex machinery! You don't need CRDTs to just replicate a log of all operations. Of course, you also can't do any meaningful work with such a data structure as a foundation, because it immediately becomes unusably large. |
If it's possible to model operations in a commutative way (eg. instead of assigning values to keys one just stores differences), then the conflict resolution is mathematically proven, just apply all operations in whatever order, they're commutative, great. Of course it doesn't help with "real world data", but that's where mathematically we can use and oracle (the user, or whatever linearizer service we choose).