| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by shunza 1072 days ago

A comment from "https://news.ycombinator.com/item?id=36208585"

A Critical Examination of "The Art of the Fugue" Paper in Relation to OT The introductory paragraph of the paper's abstract states:

"Existing algorithms for replicated lists, which are widely used in collaborative text editors, suffer from a problem: when two users concurrently insert text at the same position in the document, the merged outcome may interleave the inserted text passages, resulting in corrupted and potentially unreadable text. The problem has gone unnoticed for decades, and it affects both CRDTs and Operational Transformation."

The issue of text interleaving in certain CRDT solutions, such as Logoot, is not a recent or overlooked problem. In fact, it has been independently reported by several researchers and practitioners as early as 2018. For detailed visual representations, please refer to Figure 2 and Section 4.4 in [1]. An earlier version of this paper was also published in 2018 and can be accessed at https://arxiv.org/abs/1810.02137.

What sets "The Art of the Fugue" paper apart is its identification of the interleaving problem in several early OT solutions, namely JUPITER OT [1994], adOPTed control algorithm [1996], GOT control algorithm [1998], and TTF functions [2006]. However, even if we were to accept the reported interleaving problem in these co-editing solutions as valid (which is not the case, as later shown), it would be highly unjustifiable to make sweeping claims such as "existing algorithms for replicated lists, which are widely used in collaborative text editors,” suffer from the interleaving problem, or "all text collaboration algorithms have an interleaving problem” (claimed elsewhere by a co-author). This is due to the existence of numerous other co-editing solutions not covered in the paper, including GOTO (ACM CSCW1998), NICE (ACM CSCW2002), TIBOT (IEEE ICPADS2004), COT (ACM CSCW2006), GOOLGO WAVE and Docs OT (2010), and POT (IEEE TPDS2016), to name just a subset of the vast co-editing landscape. It is important to acknowledge that these examples do not encompass a considerable number of industry and open-source OT solutions as well.

Having been involved in co-editing since the 1990s, I have extensive familiarity with various OT solutions, including those mentioned in "The Art of the Fugue" paper (refer to [2]). However, I must admit that I was previously unaware of the interleaving problem in any of these solutions. Naturally, the report in "The Art of the Fugue" paper has sparked my curiosity, prompting me to review these early works to identify any overlooked aspects and evaluate the validity of "The Art of the Fugue" paper's claims.

Regrettably, after thorough review, I find the claims in the paper regarding these co-editing solutions to be unfounded. To share my findings with fellow researchers and practitioners, I will publish a series of reviews on HN. These reviews will address each solution in relation to "The Art of the Fugue" paper, aiming to contribute to knowledge advancement in co-editing and provide accurate information to those who depend on it.

What’s wrong with “The art of the Fugue” paper about OT (adOPTed)?

Let's examine the original text from the paper (A.1.1. page 18) that attempts to illustrate the "interleaving" problem in the adOPTed algorithm:

   “To demonstrate forward interleaving, replica A generates ins(1, a, A) followed by ins(2, b, A). Concurrently, replica B generates ins(1, x, B). Assume A < B and consider the execution at B:

   1. B executes ins(1, x, B), so the document is x.

   2. When B receives ins(1, a, A), it computes the transformation T(ins(1,a,A),ins(1,x,B))=ins(1,a,A) because 1=1 & A<B so a is inserted before x, yielding ax.

   3. When B receives ins(2, b, A), it computes the transformation T (ins(2, b, A), ins(1, x, B)) = ins(3, b, A) because 2 > 1, so b is inserted after x, yielding axb.”

However, there is a flaw in the illustration, particularly in Step 3. According to the paper, when replica B receives ins(2, b, A), it computes the transformation T(ins(2, b, A), ins(1, x, B)) = ins(3, b, A) because 2 > 1. However, the correct transformation under the adOPTed algorithm would be T(ins(2, b, A), ins(2, x, B)) = ins(2, b, A) because 2=2 & A<B. Therefore, b should be inserted before x, yielding abx, which is the correct and non-interleaved outcome.

Additionally, it is worth noting that the adOPTed control algorithm should produce ins(2, x, B) in Step 2 using the transformation T(ins(1, x, B), ins(1, a, A)) because 1=1 & A<B. This important aspect was missing in "The Art of the Fugue" paper.

To gain a deeper understanding of the underlying issues involved, let's further explore an illustration of what would occur at replica A (not depicted in "The Art of the Fugue" paper) under the same scenario:

     1. Replica A sequentially executes ins(1, a, A) and ins(2, b, A), resulting in the document being ab.

     2. When A receives ins(1, x, B), it computes two transformations in sequence: a. T(ins(1, x, B), ins(1, a, A)) = ins(2, x, B) because 1=1 & A<B. b. T(ins(2, x, B), ins(2, b, A)) = ins(3, x, B) because 2=2 & A<B.

   As a result, x is inserted after ab, yielding abx at replica A.

It is evident that the outcome abx at replica A differs from the result axb at replica B, as depicted in "The Art of the Fugue" paper. However, the result abx at replica A aligns with the corrected result abx at replica B, and both results exhibit non-interleaving behavior. With the inclusion of this additional illustration at replica A, it becomes apparent that what was depicted in "The Art of the Fugue" paper does not accurately reflect how the adOPTed algorithm truly functions. Instead, it portrays the behavior of the dOPT algorithm, which was the first OT control algorithm. The co-editing scenario presented in the paper represents an instance of the classic dOPT-puzzle, which highlights a flaw in the dOPT algorithm:

    (ins(1, a, A) -> ins(2, b, A)) || ins(1, x, B)

In this scenario, the operations at replica A are causally related and concurrent with the operation at replica B. It's well-known that the dOPT algorithm produces inconsistent results in this straightforward scenario. This emphasizes the need to accurately represent the behavior of the adOPTed algorithm and distinguish it from dOPT to ensure a thorough understanding of the intricacies involved in co-editing solutions. The resolution to the dOPT-puzzle, which had been solved for approximately 30 years, remained largely unknown and under-appreciated by subsequent contributors in the field of co-editing. This lack of awareness regarding the insights and lessons gained from solving this puzzle is a significant factor behind numerous misconceptions and flawed arguments surrounding OT.

Therefore, it is important to highlight that the key aspect of solving the dOPT-puzzle, achieved by OT control algorithms like adOPTed, JUPITER, GOT, GOTO, COT, POT, and GOOGLE OT, is to ensure context-equivalence between the two input operations processed by the transformation function. Context-equivalence, as further explained in OTFAQ [2], means that the operations are defined on the same state. In the provided example, ins(2, b, A) is context-equivalent to ins(2, x, B), whereas it is context-inequivalent to ins(1, x, B). Emphasising this principle is crucial for a comprehensive understanding of OT and to avoid misconceptions in its application.

Lastly, I want to address why I excluded TTF from the previous discussions. The reason is that TTF does not bear relevance to illustrating the "interleaving" issue or resolving the dOPT-puzzle. When TTF is not integrated with a suitably correct OT control algorithm like adOPTed but instead combined with a flawed one like dOPT (as portrayed in "The Art of the Fugue" paper), inconsistencies inevitably arise. This is evident when we combine the illustration at replica B in the paper with the additional illustration at replica A, demonstrating the effect of integrating TTF with a dOPT-like algorithm.

In conclusion, it is vital to grasp the resolution to the dOPT-puzzle and recognise the significance of ensuring context-equivalence in OT control algorithms like adOPTed, JUPITER, GOT, GOTO, COT, POT, and GOOGLE OT, among others. This understanding is crucial in dispelling misconceptions and flawed arguments surrounding OT and preventing the repetition of similar mistakes in the future.

[1] David Sun, Chengzheng Sun, Agustina Ng, and Weiwei Cai. Real differences between OT and CRDT in correctness and complexity for consistency maintenance in co-editors. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW1), May 2020. doi:10.1145/3392825. Also available at https://arxiv.org/abs/1905.01302, May 2, 2019.

[2]. Chengzheng Sun, “OTFAQ: Operational Transformation Frequently Asked Questions and Answers,” https://www3.ntu.edu.sg/scse/staff/czsun/projects/otfaq/