by nathansobo 1293 days ago
We're quite happy with them. Why do you find them annoying?

I'll offer up that I've had a hard time wrapping my mind around them, and that in practice it seems like you need to implement a checkpointing operation on top of them, which is necessarily not conflict-free, or your replication log will expand without bound and eventually overwhelm your system. (Though perhaps not, depending on your problem domain. This isn't about CRDTs, but in this interview the engineers discuss a massive, high-frequency replication log that they don't checkpoint and have been running at scale for years. You could also say they implicitly checkpoint every trading day, and they're working on implementing checkpointing. https://signalsandthreads.com/state-machine-replication-and-...)
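
To make the checkpointing concern concrete, here is a minimal sketch of folding a log prefix into a snapshot. Everything in it is hypothetical (`ReplicatedLog`, the `peers` list, the ack scheme) and the "state" is just a counter; it is not how any particular CRDT library compacts its history. The point is that `checkpoint` can only drop ops that every known peer has acknowledged, so it needs an agreed membership list, which is the coordinated part.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedLog:
    """Hypothetical append-only op log for a counter, with prefix checkpointing."""
    snapshot: int = 0                          # state with checkpointed ops folded in
    base_seq: int = 0                          # how many ops the snapshot covers
    ops: list = field(default_factory=list)    # ops not yet checkpointed
    acked: dict = field(default_factory=dict)  # peer -> highest seq it has applied

    def append(self, delta):
        self.ops.append(delta)

    def ack(self, peer, seq):
        # Acks only ever move forward.
        self.acked[peer] = max(self.acked.get(peer, 0), seq)

    def checkpoint(self, peers):
        """Fold every op that all `peers` have acked into the snapshot.

        `peers` stands in for an agreed membership list (an assumption of this
        sketch). A peer that has never acked pins the log at its current base.
        """
        stable = min((self.acked.get(p, 0) for p in peers), default=self.base_seq)
        n = min(stable - self.base_seq, len(self.ops))
        if n <= 0:
            return 0
        self.snapshot += sum(self.ops[:n])
        del self.ops[:n]
        self.base_seq += n
        return n

    def value(self):
        return self.snapshot + sum(self.ops)


log = ReplicatedLog()
for delta in [1, 2, 3, 4]:
    log.append(delta)
log.ack("a", 4)
log.ack("b", 2)
dropped = log.checkpoint(["a", "b"])  # only the first two ops are stable
print(dropped, log.ops, log.value())
```

If peer "b" never acks again, the log keeps growing, which is the unbounded-growth case described above.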

That being said I would use CRDTs for any greenfield collaboration project.

One common failure mode is that two people start typing at the beginning of the same line, and rather than getting two lines, it alternates characters. At least, Etherpad did this.
Etherpad used Operational Transform, not CRDTs.

Source: I have been Etherpad's maintainer for two years.

This is called the “interleaving problem”. It shows up with simple list algorithms like fractional indexing.
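
A minimal sketch of how this happens, assuming a naive fractional index where each new character takes the midpoint between its left and right neighbours and ties are broken by site id. The helper names (`type_run`, `merge`) are made up for illustration and this is not any particular library's scheme.

```python
from fractions import Fraction


def type_run(site, text, left, right):
    """Simulate one user typing `text` left to right into the gap (left, right).

    Each character takes the midpoint between the previously typed character
    and the right neighbour, as a naive fractional index would.
    """
    items = []
    for ch in text:
        key = (left + right) / 2
        items.append((key, site, ch))
        left = key
    return items


def merge(*runs):
    # Order by (position key, site id); the site id only breaks ties.
    ordered = sorted(item for run in runs for item in run)
    return "".join(ch for _, _, ch in ordered)


# Two users type concurrently at the start of the same empty line.
alice = type_run("A", "abc", Fraction(0), Fraction(1))
bob = type_run("B", "xyz", Fraction(0), Fraction(1))
print(merge(alice, bob))  # axbycz
```

Both users independently pick the keys 1/2, 3/4, 7/8, so the tie-break alternates between them character by character.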

All the main text editing CRDT algorithms around today solve this without trouble (Yjs, Automerge, diamond types, etc.).

Has anyone used Yjs in practice? I've tried recently, but sadly the docs seem terribly unfinished. And they're lacking examples of how to use it for another purpose (other than text editing).
Yjs is being quite heavily used in the industry[1], and is being researched by even more companies. There are also demos showing how to integrate it with existing rich text editors[2]. If you have some ideas about the missing parts, you could also open a topic on discuss.yjs.dev - the documentation page (https://docs.yjs.dev) has tons of useful links though.

Re. other-purpose projects - the main target of Yjs/Yrs is sequential data structures (text, arrays), but it also has support for maps and XML-like elements. In general you can build most data structures with it. I agree that it would be nice to have some other applications in demos though.

[1] https://docs.yjs.dev/yjs-in-the-wild [2] https://github.com/yjs/yjs-demos

Check out https://tiptap.dev/hocuspocus if you want a simple, plug-and-play option for a text editor.
We honestly don’t solve it yet, but haven’t found it that big of an issue in practice. Would be curious to see the best resource on it.
I've described this issue in my blog post together with CRDT variants that address and solve it: https://bartoszsypytkowski.com/operation-based-crdts-arrays-...

In practice, many CRDT libraries nowadays (eg. Yjs and Automerge) are using structures that don't come with interleaving issues.

I remember reading about it in one of Martin Kleppmann’s papers, though I can’t remember which one.

It’s an ordering problem that comes from some of the simpler ordering algorithms. For Diamond types I’m using a variant of Yjs’s ordering. But even RGA doesn’t have this problem because each character’s insert location is specified by naming the character immediately to the left when that character was typed.
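
A simplified sketch of that idea, assuming Lamport-style `(counter, site)` ids and causal delivery (an op arrives only after the character it names as its left neighbour). The names `Insert`, `integrate`, `typed` and `replay` are hypothetical; this is an RGA-style illustration, not automerge's or diamond types' actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Id = Tuple[int, str]  # (Lamport counter, site id)


@dataclass(frozen=True)
class Insert:
    id: Id
    left: Optional[Id]  # id of the character immediately to the left; None = start
    char: str


def integrate(doc, op):
    """Insert `op` after the character it names, RGA-style."""
    if op.left is None:
        idx = 0
    else:
        idx = next(i for i, item in enumerate(doc) if item.id == op.left) + 1
    # Skip over items with a larger id so every replica picks the same slot
    # regardless of the order in which concurrent ops arrive.
    while idx < len(doc) and doc[idx].id > op.id:
        idx += 1
    doc.insert(idx, op)


def typed(site, text):
    """Ops produced by one site typing `text` left to right at the start."""
    ops, left = [], None
    for n, ch in enumerate(text, start=1):
        op = Insert((n, site), left, ch)
        ops.append(op)
        left = op.id
    return ops


def replay(ops):
    doc = []
    for op in ops:
        integrate(doc, op)
    return "".join(item.char for item in doc)


alice, bob = typed("A", "abc"), typed("B", "xyz")
print(replay(alice + bob))  # xyzabc
print(replay(bob + alice))  # xyzabc
```

Because each character is anchored to the one typed just before it, each user's run stays contiguous and only the order of the two runs is decided by the id comparison.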

This repository implements a few different list CRDTs using an insertion sort approach, where the algorithm scans for the appropriate location every time an insert happens. This is the scanning function for RGA (automerge’s algorithm):

https://github.com/josephg/reference-crdts/blob/fed747255df9...

And this is an interactive visualisation of how diamond types works (which uses Yjs’s algorithm instead), complete with run-length encoding: https://home.seph.codes/public/diamond-vis/

This is a common artifact of operational transforms I believe. Etherpad launched in 2008; CRDTs were proposed in 2011. AFAIK Etherpad used (uses?) OT.

Open to correction though, it's been a while since I dug into the differences in these approaches & my memory is imperfect.