

by nused 2811 days ago

Casual, blanket theoretical characterizations of OT and CRDT algorithms are exactly the kind of folly that the article is trying to address.

For example, it's a logical contradiction to begin with the statement "OT's complexity is _generally_ governed by editing history (H), i.e. O(H)", conclude that "CRDTs have much better perf than OT", and then add a qualification that the argument applies only to the specific off-line editing scenario (which, by the way, is also not true).

It's also inaccurate to base CRDT's complexity on N, the document size, i.e. the number of characters visible in the document. You need to include tombstones in the case of WOOT variants, or use garbage collection for RGA (which then requires vector clocks). These nuances are described in Section 4.3.
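To make the tombstone point concrete, here is a minimal sketch in Python. It is not WOOT or RGA: it has no concurrent-insert ordering at all and only models the local bookkeeping. The names `Elem`, `TombstoneSeq` and the `(site, counter)` identifiers are hypothetical, chosen for illustration. It shows that deletes shrink the visible text but not the internal sequence that a position lookup has to scan.

```python
from dataclasses import dataclass


@dataclass
class Elem:
    ident: tuple           # hypothetical unique id, e.g. (site, counter)
    char: str
    deleted: bool = False  # True once the element has become a tombstone


class TombstoneSeq:
    """Toy sequence that keeps deleted elements as tombstones (illustration only)."""

    def __init__(self):
        self.elems = []

    def _internal_index(self, visible_pos):
        # Map a visible offset to a slot in self.elems by scanning past tombstones.
        seen = 0
        for i, elem in enumerate(self.elems):
            if not elem.deleted:
                if seen == visible_pos:
                    return i
                seen += 1
        return len(self.elems)

    def insert(self, visible_pos, char, ident):
        self.elems.insert(self._internal_index(visible_pos), Elem(ident, char))

    def delete(self, visible_pos):
        # Mark the element as deleted instead of removing it.
        self.elems[self._internal_index(visible_pos)].deleted = True

    def text(self):
        return "".join(e.char for e in self.elems if not e.deleted)


seq = TombstoneSeq()
for i, ch in enumerate("abc"):
    seq.insert(i, ch, ("A", i))
seq.delete(1)
print(seq.text(), len(seq.text()), len(seq.elems))
```

After the delete, the visible text has two characters while the internal list still holds three elements, so in this toy model the scan cost follows the internal length rather than N.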

OT's complexity is governed by O(c), where c is the number of concurrent operations, period.

For off-line edits, the short answer is that OT's complexity is also not O(H), if H is the number of character edits, because you can easily apply compression.

Now, the longer answer: there is pretty strong spatial locality in the distribution of edits over a document. We don't sporadically and randomly add or delete characters around a document; intuitively, most inserts are of adjacent characters (i.e. strings), and many deletes are over newly inserted strings. OT uses string-wise operations, hence it will compress n consecutive character insert ops down to a single string op. In addition, you can compress delete edits over newly inserted content, e.g. (with 0-based positions) [insert 'To be or not yoo be' at 0, delete 'y' at 13, delete 'o' at 13, insert 't' at 13] => [insert 'To be or not to be' at 0].

This is the basic idea behind "operation compression", which is what OT would use to support off-line, non-real-time, asynchronous editing, whatever you want to call it. There is a better example here (http://www3.ntu.edu.sg/home/czsun/projects/otfaq/#_Toc321146...).
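The compression idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular OT system: `Op` and `compress` are hypothetical names, positions are 0-based offsets (so the 'y' in the example string sits at offset 13), and an op is only folded into the immediately preceding insert.

```python
from dataclasses import dataclass


@dataclass
class Op:
    kind: str  # "insert" or "delete"
    pos: int   # 0-based offset in the document
    text: str  # the inserted or deleted string


def compress(ops):
    """Fold a local op log into fewer string-wise ops (hypothetical helper)."""
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if prev is not None and prev.kind == "insert":
            start, end = prev.pos, prev.pos + len(prev.text)
            rel = op.pos - start
            if op.kind == "insert" and start <= op.pos <= end:
                # The insert lands inside or right next to the pending string.
                prev.text = prev.text[:rel] + op.text + prev.text[rel:]
                continue
            if op.kind == "delete" and start <= op.pos and op.pos + len(op.text) <= end:
                # The delete only removes text created by the pending insert.
                prev.text = prev.text[:rel] + prev.text[rel + len(op.text):]
                if not prev.text:
                    out.pop()  # the insert was deleted entirely
                continue
        out.append(Op(op.kind, op.pos, op.text))  # copy so inputs stay unchanged
    return out


log = [
    Op("insert", 0, "To be or not yoo be"),
    Op("delete", 13, "y"),
    Op("delete", 13, "o"),
    Op("insert", 13, "t"),
]
print(compress(log))

typed = [Op("insert", i, ch) for i, ch in enumerate("abc")]
print(compress(typed))
```

Both logs collapse to a single string insert, so the work left for synchronization depends on the number of compressed ops rather than on the raw count of character edits.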


marc_shapiro 2804 days ago

These optimisations (whole-string operations, compression, etc.) apply equally to CRDTs. See for instance DOI 10.1145/2957276.2957300. See also the blanket optimisations studied by Carlos Baquero's group (which I doubt could carry over to OT because of the complexity of OT theory).


nused 2800 days ago

DOI 10.1145/2957276.2957300 is an attempt at using binary-tree variants to improve CRDT's searching performance bottleneck in converting external index position to internal object identifiers. We address these specific complexities in Section 4.2.1.

DOI 10.1145/2957276.2957300 doesn't deal with operation compression. If you believe it does, please point out where we might find it.

It is trivial to state opinions like 'OT is complex', but much harder to back them up with concrete evidence. I'd think folks would find the discussion more informative without the constant hand-waving.


zawirski 2788 days ago

Have you submitted the paper to a peer-reviewed venue? I've read the arXiv version already, and I'd be more than happy to volunteer to review it if it's not too late.


nused 2783 days ago

It appears you have specific comments and/or feedback? Either post them on HN or send them to the primary contact's email. Either way, we'll make an earnest effort to respond.

This discussion is of interest to folks both inside and outside of academia, and the developer community brings valuable hands-on experience to the table. So, let's keep it inclusive.


zawirski 2771 days ago

Let me prioritize the usual academic processes first to make sure the outcomes are more persistent and visible, given we are talking about a publication under review or a draft. I will reach out to the paper's primary contact.
