Hacker News new | ask | show | jobs
by amsha 2810 days ago
Here's the key sentence: "concrete implementations of CRDT in co-editors revealed key missing steps in CRDT literature." This paper may be correct for academic CRDTs but it is very wrong when looking at industry implementations.

My hunch is that because CRDTs are so much easier to grok than OT, engineers are empowered to make use case-specific improvements that aren't reflected in academic literature.

For example, the Logoot/LSEQ CRDTs have issues with concurrent insert interleaving; however, this can be solved by dedicating a bit-space in each ID solely for followup inserts. The "Inconsistent-Position-Integer-Ordering" problem is solved by comparing ids level-by-level instead of in one shot.

Fundamentally, CRDTs have strong strategic advantages over OT. Given a document size N and a document edit history H:

* CRDT size is O(N) where OT is O(H)

* CRDT updates are O(log N) where OT is O(H^2)

In any nontrivial document, H ≫ N. This means CRDTs have much better perf than OT. Additionally, the best CRDTs (like Logoot/LSEQ) don't require tombstones, garbage collection, or "quiescence." The complexity burden is far lower.

To top it off, CRDTs are offline-capable by default.

2 comments

Casual and blanket theoretical characterizations of OT and CRDT algorithms is exactly the kind of folly that the article is trying to address.

For example, it's a logical contradiction to begin the statement with "OT's complexity is _generally_ governed by editing history (H), i.e. O(H)" to conclude that "CRDTs have much better perf than OT", then adding a qualification that argument applies only for the specific off-line editing scenario (which btw is also not true).

It's also inaccurate to base CRDT's complexity on N - the document size, i.e. number of characters visible in the document. You need to include tombstones for the case of WOOT variants or use garbage collection for RGA (which then requires vector clocks). These nuances are described in sections 4.3.

OT's complexity is governed by O(c) where c is number of concurrent operations, period.

For off-line edits, the short answers is that OT's complexity is also not O(H), if H is the number of character edits, because you can easily apply compression. Now, the longer answer: there is pretty strong spatial locality in the distribution of edits over a document -- we don't sporadically and randomly add or delete characters around a document, but intuitively, most of the inserts will be adjacent characters (i.e. strings), and many of the deletes are over newly inserted strings. OT uses string-wise operations, hence it will compress n consecutive character insert ops down to a single string op. In addition you can compress delete edits over newly inserted content: i.e. [insert 'To be or not yoo be' at 0, delete 'y' at '14', delete 'o' at '14', insert 't' at '14'] => [insert 'To be or not to be']. This is basic idea behind "operation compression" which is what OT would use to support off-line, non-real-time, asynchronous editing, however you want to call it. There is a better example here (http://www3.ntu.edu.sg/home/czsun/projects/otfaq/#_Toc321146...).

These optimisations (whole-string operations, compression, etc.) apply equally to CRDTs. See for instance DOI 10.1145/2957276.2957300. See also the blanket optimisations studied by Carlos Baquero's group (which I doubt could carry over to OT because of the complexity of OT theory).
DOI 10.1145/2957276.2957300 is an attempt at using binary-tree variants to improve CRDT's searching performance bottleneck in converting external index position to internal object identifiers. We address these specific complexities in Section 4.2.1.

DOI 10.1145/2957276.2957300 doesn't deal with operation compression. If you believe it does, please point out where we might find it.

It is trivial state opinions like 'OT is complex', but much harder work to back it up with concrete evidence. I'd think folks will find the discussion more informative without the constant hand-waving.

Have you submitted the paper to a peer-reviewed venue? I've read the arXiv version already, and I'd be more than happy to volunteer reviewing it if that's not too late.
It appears you have specific comments and/or feedback? Either post them on HN or send them to the primary contact's email. Either way, we'll make an earnest effort to respond.

This discussion is of interest to folks both inside and outside of academia and the developer community brings valuable hands-on experiences to the table. So, let's keep it inclusive.

Let me prioritize the usual academic processes first to make sure the outcomes are more persistent and visible, given we are talking about a publication under review or a draft. I will reach out to the paper's primary contact.
The time complexity of most OT systems is not related to H.
I've been working in OT systems for years (G Wave, ShareJS, ShareDB, some other stuff). I'm consistently surprised by how badly academic papers predict OT systems will perform. In reality, they perform great. My little C implementation of text OT can handle about 20M text operation transforms / second[1].

Part of the gap is that many academic papers model text operations as just single character edits. If you do that, copy+pasting some text into an editor can inject thousands of insert operations into the system all at once. But that design is really sloppy - nobody actually designs real world OT systems like that. A much better way to design text OT operations uses a list of skip, insert and delete parts. That way you can arbitrarily compose edits together. This way an entire pasted string just gets inserted in one operation and performance is fine. (Or if the user goes offline, you can merge all the edits they do into a single change, then transform & commit it all at once).

I've still never seen the OT algorithms of anything I've worked on have a meaningful impact on performance. Actually thinking about it I'm not sure if I've ever even seen the OT code show up in profile traces.

[1] https://github.com/ottypes/libot

My personal experience is different. OT tends to be quadratic in terms of computational complexity and codebase size. As long as you consider a simple case, 1^2 is 1. Then, add more data types (not just text), faulty servers, faulty network, non trivial network architectures... All these problems tend to interact. They easily go combinatorial, but again 1! is 1. I worked with an OT codebase that had C++ templates within C defines and I edited them using regexes (x types of operations => x^2 types of potential transformations).

OT aside, data sync in general is not a solved problem. Maybe I am especially unlucky, but I have yet to see a chat app that has no sync glitches. A chat is not the most difficult data structure to sync, but I saw bugs in Slack, Telegram, Instagram, Gchat, Skype, others... WhatsApp does not even try to sync...

A lot depends on the diversity of your use cases and the number of the nines you expect from the system...

<- This, concurs with our experience.

When we profiled our string OT implementation on a single 2-3 GHz CPU, with 2-3 CPI (Cycle Per Instruction)), it was able to handle about 10M transformations per second. On a half dozen of OT based system I've worked on, the OT implementation was never where we found the real engineering complexity or the performance bottleneck in working coeditors.

The significance of using string-wise operation model in OT is under appreciated and can't be emphasized enough for real world systems. The first string-wise OT transformation functions can be traced back to Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems, 1998. We also found that skip, insert, and delete can improve compression of string edits.

I fully agree, I don't understand the emphasis on performance both in the literature and in comments on the subject.

Additionally, it is more common to see CRDT libraries that make character-wise operations and create an object for each character (eg. y-js.org, github.com/google/ot-crdt-papers), which is not ideal.

Obviously there are also libraries such as Atom's Teletype that have string-wise operations.

The cynic in me feels like the CRDT vs. OT war misses the forest for the trees. What matters is lacking features, and the feature that is most needed and least described is a systematic way to offer a diff editor matching the normal experience. Indeed, after having been offline a while, one wants to see and select how their changes will integrate the shared resource.

I think you'll find the article supports your sentiment to an extent.

One of the key messages we tried to get across is that academic research for co-editing ought to go beyond theoretical analysis but rather drive research by pushing the envelope in new areas of application and/or features. OT's evolution has been symbiotic with new applications, while we haven't seen this in CRDT (for coediting) research. The reality is that you'll find new CRDT papers every year that still (a) make fairly broad claims of benefits are either theoretically dubious or not backed/validated by application/system implementations, (b) dwelling on technical issues that were resolve years ago. To move beyond this, we want to have put these issues in context (a general transformation approach) and dispel the most common fallacies in theoretical analysis.

Related to the 'visual-merging' feature you mentioned, when I worked on making MS Word collaborative as a research project, one of the UX research questions was how to allow users to visualize the different combined effects of updates to objects (e.g. if you changed the background of object X to red and I change it to yellow at the same time). We came up with a multi-versioning technique to display the potentially different combinations and help users select one that's the most desirable. It is definitely a interesting and challenging problem to consider for text documents.

My understanding of OT is that each change requires traveling backward though the edit history to find the first state that is consistent between sites, then traveling forward to transform the new change against past edits. In the worst case, an edit requires transformation against the document’s origin state.

Is that not the case?

Most OT systems only transform concurrent operations.
That's true. My performance claims are biased toward offline-capable editors, where the last consistent state between sites may be thousands or millions of edits ago. For online-only editors the set of edits between now and the most recent consistent state may be much smaller.