Hacker News new | ask | show | jobs
by ryuuseijin 3172 days ago
What you describe looks a lot like a Google Wave like OT system. Wave-style OT is eventually consistent, like CRDTs, but you need a central server to give the event history a total order. This is necessary because Wave-style OT is a 1-1 model: clients are 1-1 connected with the server, but not with each other, which would be n-n, which is what CRDTs can do.

The total order of the central server can make the system simpler and more efficient, but by itself it doesn't solve the problem that Wave has, which is allowing a client to edit his text/message without being interrupted by network latency/interruptions -- imagine typing a letter and having to wait for the server to acknowledge that keypress with >100ms latencies. To solve this problem, you still need some form of xform/merge algorithm that OT and CRDT systems provide.

EDIT: I assumed you were not familiar with OT systems since you didn't mention it in your post, but now that I followed your link I can see that you are. In that light, it seems your comment is more a question about what the tradeoffs are between OT and CRDT systems rather than whether a central server can solve all problems without xform/merge logic.

One tradeoff that comes to mind when thinking about OT and CRDT systems is in the way operations track locations in the datastructure. In OT systems you have offsets (small), in CRDTs you have uuids (large) or dynamically growing identifiers (usually small but possibly large). This has implications for the byte-size of operations or the in-memory datastructure.

Another is that CRDTs have a pruning problem. It has been some time since I looked at CRDTs, but I remember that Wave-style OT didn't have the same problem due to the central server. The pruning problem can cause a CRDT to grow larger than it needs to by forcing it to keep more historic data around just in case it gets an old operation it hasn't seen yet. The central server solves this problem by guaranteeing that it will have sent you all old operations before sending you a newer one. If you know all actors in a n-n system you can also solve this issue, but in an unbounded n-n system I didn't see any way this issue can be solved when I was researching it.

EDIT2: Just want to add that there are lots of other problems that are more practical than theoretical. For example, authorization, authorative copy of the data, REST API, things like that, but that would depend more on the exact use case.

2 comments

With CRDTs log entries can be purged once they synchronize with everyone, no need to keep them just in case. Although it's rather implementation specific.
Thanks, I did address that when I said you can solve it if you know all actors in a n-n system, but I should have been clearer by pointing out the solution, which is (as you already said) every known actor acknowledging it.

In an unbounded n-n system I still don't see a solution.

Thanks for this! I think I'm still digesting, but I do find the contrast you're drawing between OT and CRDTs interesting. I don't think I had ever seen the fundamental difference between them to be the network topology, but I can't quite claim to understand either in sufficient depth.
Btw, Wave-style OT needs a central server, but there are other forms of OT that do not. They mention this in your link. If your OT system can satisfy the TP2 propery you can do n-n synchronization but still have what you'd call an OT system.