Hacker News new | ask | show | jobs
by samwillis 93 days ago
It's disingenuous to suggest that "Yjs will completely destroy and re-create the entire document on every single keystroke" and that this is "by design" of Yjs. This is a design limitation of the official y-Prosemirror bindings that are integrating two distinct (and complex) projects. The post is implying that this is a flaw in the core Yjs library and an issue with CRDTs as a whole. This is not the case.

It is very true that there are nuances you have to deal with when using CRDT toolkits like Yjs and Automerge - the merged state is "correct" as a structure, but may not match your scheme. You have to deal with that into your application (Prosemirror does this for you, if you want it, and can live with the invalid nodes being removed)

You can't have your cake and eat it with CRDTs, just as you can't with OT. Both come with compromises and complexities. Your job as a developer is to weigh them for the use case you are designing for.

One area in particular that I feel CRDTs may really shine is in agentic systems. The ability to fork+merge at will is incredibly important for async long running tasks. You can validate the state after an agent has worked, and then decide to merge to main or not. Long running forks are more complex to achieve with OT.

There is some good content in this post, but it's leaning a little too far towards drama creation for my tast.

1 comments

Author here, sorry if this was not clear: that specific point was not supposed to be an indictment of all CRDTs, it was supposed to be much more narrow. Specifically, the Yjs authors clearly state that they purposefully designed its interface to ProseMirror to delete and recreate the entire document on every collab keystroke, and the fact that it stayed open for 6 YEARS before they started to try to fix it, does in my opinion indicate a fundamental misunderstanding of what modern text editors need to behave well in any situation. Not even a collaborative one. Just any situation at all.

I think it's defensible to say that this point in particular is not indicting CRDTs in general because I do say the authors are trying to fix it, and then I link to the (unpublicized) first PR in that chain of work (which very few people know about!), and I specifically spend a whole paragraph saying I hope that I a forced to write an article in a year about how they figured it all out! If I was trying to be disingenuous, why do any of that?

> sorry if this was not clear

It's easy to make that mistake reading your post because of sentences like

> I want to convince you that all of these things (except true master-less p2p architecture) are easily doable without CRDTs

> But what if you’re using CRDTs? Well, all these problems are 100x harder, and none of these mitigations are available to you.

It sure sounds a lot like you're calling CRDTs in general needlessly complex, not just the yjs-prosemirror integration.

To be clear, we ARE arguing CRDTs needlessly complex for the centralized server use case. What I am describing in the "delete and replace all on every keystroke" problem is the point at which it became clear to me that the project did not understand what modern text editors need to perform well in any circumstance, let alone a collab one.

I think this is still reasonable to say because the final paragraph in that section is 100% about how they might fix the delete-all problem, and I hope they do, so that I can write about that, too. But also, that the rest of the article is going to be about how you have to swim upstream against their architecture to accomplish things that are either table stakes or trivial in other solutions.

> To be clear, we ARE arguing CRDTs needlessly complex for the centralized server use case.

I've been working in the OT / CRDT space for ~15 years or so at this point. I go back and forth on this. I don't think its as clear cut as you're making it out to be.

- I agree that OT based systems are simpler to program and usually simpler to reason about.

- Naive OT algorithms perform better "out of the box". CRDTs often need more optimisation work to achieve the same performance.

- But with some optimisation work, CRDTs perform better than OT based systems.

- CRDTs can be used in a client/server model or p2p. OT based systems generally only work well in a centralised context. Because of this, CRDTs let you scale your backend. OT (usually) requires server affinity. CRDT based systems are way more flexible. Personally I'd rather complex code and simpler networking than the other way around.

- Operation based CRDTs can do a lot more with timelines - eg, replaying time, rebasing, merging, conflicts, branches, etc. OT is much more limited. As a result, CRDT based systems can be used for both realtime editing or for offline asyncronous editing. OT only really works for online (realtime) editing.

(For anyone who's read the papers, I'm conflating OT == the old Jupitor based OT algorithm that's popular in google docs and others.)

CRDTs are more complex but more capable. They can be used everywhere, and they can do everything OT based systems can do - at a cost of more code.

You can also combine them. Use a CRDT between servers and use OT client-to-server. I made a prototype of this. It works great. But given you can make even the most complex text based CRDT in a few hundred lines anyway[1], I don't think there's any point.

[1] https://github.com/josephg/egwalker-from-scratch

The algorithm in prosemirror-collab-commit is inspired by Google Wave, and implemented as a slight tweak to the prosemirror-collab system. The tweak is in the name.

I'm not sure about classical OT, and it's been a really long time since I wrote prosemirror-collab-commit, but.. On the authority it's more like nth triangle where n is the number of concurrent commits being processed based off the same document version for mapping. So 50 clients sending in commits at the same based off the same doc version would be (50 * 51) / 2. Applying has a different, potentially larger, cost and that's O(n).

You don't have to have server affinity, but I'd be cooler if you did. Locks you need.

It works offline, sure. For some definition of "works". The further histories diverge the greater the chance of losing user intent. But that really depends on the nature of the divergence. Some inspection and heuristics could probably be used to green-light a LOT of "offline" scenarios before falling back on an interactive conflict resolution strategy.

I'm not sure what magic CRDTs exist today, but in the case of Yjs and ProseMirror allowing histories to drift too far will absolutely risk stomping all over user intent when they are brought back together.

The magic of CRDTs does not prevent this. They are in exactly the same boat as OT, prosemirror-collab and prosemirror-collab-commit. It can't be prevented. The problem is worse with CRDTs because they instantly destroy user intent in the conversion to/from their underlying representation, which is the XML document. See discussion with Marijn about, e.g., splitting blocks above.
Actually, I think I agree with most of this... except the part where you think it's not clear-cut, ha ha. I meant that not as a comparison between OT and CRDTs, but as a comparison to prosemirror-collab. My opinion is that in the centralized server case, it is (unfortunately) basically better in every dimension.
> But with some optimisation work, CRDTs perform better than OT based systems.

I read your paper and I think this is a mistake. You assume that OT has quadratic complexity because you're considering classic operation-based OT. But OT can be id-based, in which case operations are transformed directly on the document, not on other operations. This is essentially CRDT without the problems of supporting P2P, and therefore the best CRDT will never perform better than the best OT.

> CRDTs let you scale your backend. OT (usually) requires server affinity. CRDT based systems are way more flexible. Personally I'd rather complex code and simpler networking than the other way around.

All productivity apps that use these tools in any way shard by workspace or user, so OT can scale very well.

If you don't scale CRDT that way, by the way, you'd be relying too much on "eventual consistency" instead of "consistency as quickly as possible."

> (For anyone who's read the papers, I'm conflating OT == the old Jupiter-based OT algorithm that's popular in Google Docs and others.)

Similar to what I said before. I think limiting OT to an implementation that’s over three decades old doesn’t do OT justice.

> I think limiting OT to an implementation that’s over three decades old doesn’t do OT justice.

I haven't kept up with the OT literature after a string of papers turned out to "prove correctness" in systems which later turned out to have bugs. And so many of these algorithms have abysmally bad performance. I think I implemented an O(n^4) algorithm once to see if it was correct, but it was so slow that I couldn't even fuzz test it properly.

> You assume that OT has quadratic complexity because you're considering classic operation-based OT. But OT can be id-based, in which case operations are transformed directly on the document, not on other operations.

If you go down that road, we can make systems which are both OT and CRDT based at the same time. Arguably my eg-walker algorithm is exactly this. In eg-walker, we transform operations just like you say - using an in memory document model. And we get most of the benefits of OT - including being able to separately store unadorned document snapshots and historical operation logs.

Eg-walker is only a CRDT in the sense that it uses a grow-only CRDT of operations, shared between peers, to get the full set of operations. The real work is an OT system, that gets run on each peer to materialise the actual document.

> This is essentially CRDT without the problems of supporting P2P, and therefore the best CRDT will never perform better than the best OT.

Citation needed. I've published plenty of benchmarks over the years from real experiments. If you think I'm wrong, do the work and show data.

My contention is that the parts of a CRDT which make them correct in P2P settings don't cost performance. What actually matters for performance is using the right data structures and algorithms.

> the fact that it stayed open for 6 YEARS before they started to try to fix it...

This is all opensource software, provided for free by volunteers. If you want better bindings, go write them. Or pay someone else to do so.

Just to be extremely clear: we pay for a lot of OSS software. We pay one individual project more than $10,000 a year. We would have paid for Yjs too, if we thought it was a good use of resources!