by scofalik 91 days ago
I read both parts. Well written, I agree with a lot of stuff.

I am a long-time CKEditor dev. I was responsible for implementing real-time collaboration in the editor and the OT implementation.

Regarding the first part of your article: guess what - CKEditor would output "" :). And even better, if the user who deleted everything hits undo, you'd get "u" back where it was originally typed.

However, I fully agree that for every algorithm you will be able to find a scenario where it fails to resolve a conflict the way the user expects. But we cannot ask the user to resolve a conflict manually every time it happens.

Offline editing, as you correctly observed, is more difficult, because the conflicts pile up, and multiple wrong decisions can add up to a horrifying final result. I fully agree that this is not only an algorithmic problem but also a UX problem. Add to this that in many apps you will also have other (meta)data that has to be synced (besides document data).

CKEditor is, in theory, ready for offline editing. From the algorithm's POV, offline is no different than a very, very, very slow connection (*). In the end, you receive a set of operations to transform against another set of operations. However, currently we put the editor in a read-only state when the connection breaks. We are aware that even if all transformations resolve as expected, the end result may still be "weird". And even if the end result is actually as expected, the amount of changes may be overwhelming to a person who just got the connection back, so it still may be good to provide some UI/UX to help them understand what happened.

(*) - that is, unless the editing session on the server has already ended and, simply put, you don't have anything to connect to (to pull operations from).
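
To make "a set of operations to transform against another set" concrete, here is a minimal sketch. It is not CKEditor's code: it assumes a flat string instead of a tree, insert-only operations, and a made-up tie-break rule (remote wins when two inserts hit the same position). The names `Insert`, `transform`, `transformSets` and `applyAll` are hypothetical.

```typescript
// Hypothetical insert-only operation on a flat string (not CKEditor's model).
type Insert = { pos: number; text: string };

// Rewrite `a` so it can be applied after the concurrent operation `b`.
// Assumption: on equal positions, the op with aWinsTies === true stays in front.
function transform(a: Insert, b: Insert, aWinsTies: boolean): Insert {
  if (a.pos < b.pos || (a.pos === b.pos && aWinsTies)) return a;
  return { pos: a.pos + b.text.length, text: a.text };
}

// Transform two concurrent sequences against each other.
// result.local  can be applied after all of `remote`;
// result.remote can be applied after all of `local`.
function transformSets(local: Insert[], remote: Insert[]) {
  let remoteCur = remote.slice();
  const localOut: Insert[] = [];
  for (let l of local) {
    const nextRemote: Insert[] = [];
    for (const r of remoteCur) {
      const lNext = transform(l, r, false); // local loses ties
      nextRemote.push(transform(r, l, true)); // remote wins ties
      l = lNext;
    }
    localOut.push(l);
    remoteCur = nextRemote;
  }
  return { local: localOut, remote: remoteCur };
}

// Apply a sequence of inserts to a string, in order.
function applyAll(doc: string, ops: Insert[]): string {
  return ops.reduce(
    (d, op) => d.slice(0, op.pos) + op.text + d.slice(op.pos),
    doc
  );
}
```

Whether the two sets were produced 200 ms or two days apart makes no difference to this loop, which is the point about offline being a very slow connection. What changes is how long the sets get and how surprising the merged result may look.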

Regarding OT. I have a feeling that one mistake most people make is that they take OT as it is described in some paper or article, and don't want to iterate on the idea. To me, this is not just one algorithm, but rather an idea of how to think about and manage changes happening to the data.

For CKEditor, from the very beginning, we were forced to innovate over typical OT implementations. First of all, we focused on users' intentions. Second, we needed to adapt it to a tree data structure. These challenges shaped my way of thinking - OT is "an idea", and you need to adapt it to your project. Someone here asked if there's a library for OT, because they want to use it for spreadsheets. I'll say -- write it on your own and adapt it to spreadsheets. You'll discover that maybe you don't need some operations, or maybe you need new operations dedicated to spreadsheets. This is what we ended up doing. @Reinmar already posted this link, but we describe our approach here: https://ckeditor.com/blog/lessons-learned-from-creating-a-ri....

Circling back to your example with typing and removing a whole sentence. This is how you innovate over OT. To us, such a deletion is not deleting N single characters starting from position P. The intention is to remove some contiguous range of text. If someone writes inside the range, it just changes the boundary of the stuff to remove, but surely we don't want to show some random letters after the deletion happens. We account for that and make modifications in our OT implementation.
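
A rough sketch of that "delete a range, not N characters" idea, again on a flat string and not taken from CKEditor's source. All names (`Insert`, `DeleteRange`, `transformDelete`, `transformInsert`, `applyOp`) are hypothetical. One loud simplification: an insert that lands inside an already-deleted range is simply dropped here, whereas a real editor has to keep it somewhere so that undo can bring it back.

```typescript
// Hypothetical operations on a flat string; the deleted range is [start, end).
type Insert = { kind: "insert"; pos: number; text: string };
type DeleteRange = { kind: "delete"; start: number; end: number };

// Rewrite the range deletion so it can run after a concurrent insert.
function transformDelete(del: DeleteRange, ins: Insert): DeleteRange {
  const len = ins.text.length;
  if (ins.pos <= del.start) {
    // Typed before the range: the whole range shifts right.
    return { ...del, start: del.start + len, end: del.end + len };
  }
  if (ins.pos >= del.end) return del; // Typed after the range: unaffected.
  // Typed inside the range: the boundary grows to swallow the new text.
  return { ...del, end: del.end + len };
}

// Rewrite the insert so it can run after the concurrent range deletion.
// Simplification: text typed inside the deleted range is dropped (null).
function transformInsert(ins: Insert, del: DeleteRange): Insert | null {
  if (ins.pos <= del.start) return ins;
  if (ins.pos >= del.end) {
    return { ...ins, pos: ins.pos - (del.end - del.start) };
  }
  return null;
}

function applyOp(doc: string, op: Insert | DeleteRange | null): string {
  if (op === null) return doc;
  if (op.kind === "insert") {
    return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
  }
  return doc.slice(0, op.start) + doc.slice(op.end);
}

// One user deletes the whole text while another types "u" in the middle.
const doc = "Hello world";
const del: DeleteRange = { kind: "delete", start: 0, end: doc.length };
const ins: Insert = { kind: "insert", pos: 5, text: "u" };

const typedFirst = applyOp(applyOp(doc, ins), transformDelete(del, ins));
const deletedFirst = applyOp(applyOp(doc, del), transformInsert(ins, del));
console.log(JSON.stringify(typedFirst), JSON.stringify(deletedFirst)); // "" ""
```

A character-by-character deletion would instead leave a stray "u" behind on one path, which is exactly the result the article complains about.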

It's similar with positions in the document. In CKEditor, you can use LivePositions and LiveRanges, which are basically paths in a tree data structure. Every position is transformed by operations too. Many of our features are based on that.
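
As an illustration of a position being "a path in a tree" that gets transformed by an operation, here is a small sketch. It is an assumption-laden toy, not CKEditor's LivePosition implementation: it handles only insertions, and `Path`, `InsertOp` and `transformPathByInsert` are hypothetical names. A path like [1, 3] means "offset 3 inside the child at index 1 of the root".

```typescript
// Hypothetical tree position: each number is an offset at one tree level.
type Path = number[];

// Hypothetical insertion of `howMany` nodes at `path`.
interface InsertOp {
  path: Path;
  howMany: number;
}

// Return where `pos` ends up after `op` has been applied.
// Assumption: a position exactly at the insertion point moves to the right.
function transformPathByInsert(pos: Path, op: InsertOp): Path {
  const depth = op.path.length - 1; // tree level whose offsets the insert shifts
  if (pos.length <= depth) return pos; // position is shallower: unaffected
  for (let i = 0; i < depth; i++) {
    if (pos[i] !== op.path[i]) return pos; // different parent: unaffected
  }
  if (pos[depth] < op.path[depth]) return pos; // before the insertion point
  const result = pos.slice();
  result[depth] += op.howMany;
  return result;
}
```

A "live" position would then just be a stored path that is run through a function like this for every operation applied to the document, so that it keeps pointing at the same content.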

So, my take here is -- don't bash OT because you based your experience on some simple implementations. Possibly the same goes for Yjs. Don't bash CRDTs because Yjs is doing something badly?

And some final words regarding the second part.

We also follow the same pattern your diagram shows in the "How the simple thing works" section. As I was reading through the article and looking at the provided examples, it was hard for me not to think that what's happening is some kind of OT variant, maybe simplified, or maybe adapted to some specific cases. There are strong similarities between what you described and CKEditor 5, and we use OT. Like, looking at this from a top-level view, I could say, "well, we do the same". We have the same loop with conflict resolution, we just call "rebase" a "transformation", and instead of "steps" we have "operations".

Also, you say it is 40LOCs, but how much magic happens in `step.apply()`? How much of the architecture was built to make that possible? Even Marijn makes this comment here: https://news.ycombinator.com/item?id=47409647.

For comparison, this is the CKEditor file that includes the OT functions to transform operations: https://github.com/ckeditor/ckeditor5/blob/master/packages/c.... It's 2600LOCs (!), but at least most of it is comments :). Again, the basic idea of OT is very simple (and this implementation could be simpler, we also learned a lot in the process). It's up to you how much you want to delve into solving "user intention" issues.

1 comments

> Also, you say it is 40LOCs, but how much magic happens in `step.apply()`?

Right, but if you are already using ProseMirror, that infrastructure is in place whether you are taking advantage of it directly or bolting Yjs on top.