| Let me see if I understand this correctly: CRDTs take an editor event such as "insert at position X" and turns it into something a concrete operation like "insert to the right of node Y created by client C" which is then sent. This makes it super easy to apply concurrent operations since they have a direct reference to where they're located. However, it also means that you know have to keep track of these nodes. All of the nodes that has ever existed is present at all times, and deletion is handled as a state flag which marks it as hidden. OTs take an editor event such as "insert at position X" and keeps it like that. Whenever a concurrent event is received it then tries to "rebase" it (i.e. patch that event) so that it makes sense on top of the current events. However, this (1) can be quite finicky to get right and (2) it is based on there being One True Order of Events (i.e. a server). This approach takes an editor event such as "insert at position X" and keeps it like that. When applied, it can be inserted into an "ever-growing list of atoms with state flag". However, in this algorithm the data structure is actually capable of representing two different versions at the same time: A current version and a final version. This is handled by there being two state flags instead of one: Every node has a "current state = exists/deleted" and "final state = exists/deleted". This gives us the power of doing a "soft undo" (which is called "retreat" in the paper): We can take our own latest event which we've applied, revert the effect on the current version, while still keeping the final version the same. We're handling this very similar to CRDTs: We keep all the nodes at all time, we're just using state flags to keep track of whether it exists or not. This is useful when we observe a concurrent event. This event have references to positions which are valid in the context of its parent. If we "retreat" of all of our events until we reach the parent, we then have a data structure which represents the text at that point. We can now apply the "insert at position Y"-events which we received by interpreting "Y" in terms of the current version. After we've applied all of those events we can then look at the final version and this now actually contains the combined result of both changes! And here comes the nice part: Since the events themselves are always on the form "insert at position X" it means that we can choose another representation of applying them. For instance, if we know that there are no concurrent events that are happening, we might as well apply them directly on a string without bothering with the whole "current/final dual data structure". |
> OTs take an event..
This is how the early Jupiter OT works, yes. And most OT systems work like this. But there are also some papers on more recent OT systems which can work with more than 2 peers. Unfortunately, many of these systems have turned out to have convergence bugs and/or they are O(n^2). For our paper one of our example datasets takes tens of milliseconds to replay with CRDTs and egwalker but an hour of time with OT!
> the data structure is capable of representing two different versions…
With egwalker it’s important to distinguish between two different data structures. There’s a grow only set of original editing events. This is persisted to disk and replicated over the network. Then while actually replaying events or merging, we generate a second, temporary, in memory data structure which resembles a normal CRDT. (Except with an extra state field on each item like you said). This crdt state object isn’t persisted. It’s usually discarded as soon as the merge (transform) operation is complete. One big advantage of this approach is that this data structure does not need to represent all items ever inserted. Just the concurrent items, back to the most recent common branch. So it’s usually tiny. And that allows history to be pruned - which CRDTs typically don’t allow.
But yes, everything else is right!