Hacker News new | ask | show | jobs
by GermanJablo 90 days ago
> You don't think Figma is a serious application?

I don't know where this popular belief came from. The Figma blog literally says "Figma isn't using true CRDTs"[1].

> The only real benefit of OT is that its simpler to reason about.

That's incorrect. When you free yourself from the P2P restriction that CRDTs are subject to, there's a huge amount of metadata you can get rid of, just to mention one benefit.

[1] https://www.figma.com/blog/how-figmas-multiplayer-technology...

2 comments

> I don't know where this popular belief came from.

It may be worth reading the whole paragraph of the blog you referenced...

> Figma isn't using true CRDTs though. CRDTs are designed for decentralized systems where there is no single central authority to decide what the final state should be. There is some unavoidable performance and memory overhead with doing this. Since Figma is centralized (our server is the central authority), we can simplify our system by removing this extra overhead and benefit from a faster and leaner implementation.

> It’s also worth noting that Figma's data structure isn't a single CRDT. Instead it's inspired by multiple separate CRDTs and uses them in combination to create the final data structure that represents a Figma document (described below).

So, it's multiple CRDTs, not just one. And they've made some optimizations on top, but that doesn't make it not a CRDT?

Nowhere does it say it's multiple CRDTs. It says "isn't a single CRDT" and that "it's inspired by multiple separate CRDTs." A bit confusing, I agree.

By the way, I work at Figma.

> When you free yourself from the P2P restriction that CRDTs are subject to...

CRDT stands for "Conflict-free replicating data type". They're just, any data type with an idempotent, commutative merge function defined on the entire range of input. Technically, CRDTs have nothing to do with networks at all.

The simplest "real" CRDT is integers and the MAX function. Eg, you think of an integer. I think of an integer. We merge by taking the max of our integers. This is a CRDT. Would it have eventual consistency? Yes! Would it work in a server-to-client setup? Of course! Does it need any metadata at all? Nope! It doesn't even need versions.

There's no such thing as a "P2P restriction". Anything that works p2p also works server-to-client. You can always treat the server and client as peers.

> there's a huge amount of metadata you can get rid of, just to mention one benefit.

Can you give some examples of this metadata? In my experience (10-15 years with this stuff now), I've found you can get the same or better performance out of CRDTs and OT based systems if you're willing to make the same set of tradeoffs.

For example, OT has a tradeoff where you can discard old operations. The cost of doing so is that you can no longer merge old changes. But, you can do exactly the same thing in CRDTs, with the same cost and same benefits. Yjs calls this "garbage collection".

In my own eg-walker algorithm, there's 3 different ways you can do this. You can throw away old operations like in OT based systems - with the same cost and benefit. You can keep old metadata but throw away old data. This lets you still merge but you can't see old versions. Or you can keep old edits on the server and only lazy-load them on the client. Clients are small and fast, and you have full change history.

CRDTs generally give you more options. But more options = more complexity.

I'm no p2p idealist. Central servers definitely make some things easier, like access control. But CRDTs still work great in a centralised context.

Ok, replace "P2P restriction" with "idempotent, commutative restriction".

> For example, OT has a tradeoff where you can discard old operations. The cost of doing so is that you can no longer merge old changes.

Why wouldn’t you be able to? My server receives operations, applies them to the document, and discards them. It can receive operations as old as it wants.

___

> Can you give some examples of this metadata?

Yes, it depends on the CRDT, but if we're talking about lists or tree structures with insert and delete operations, these can come in the form of thombstones, or operation logs, or originRight or originLeft, or DAG. Even with a garbage collector, the CRDT needs to retain some of this metadata.

Yes, you can optimize by not bringing it into memory when it’s not needed. But they’re still there, even though they could be avoided entirely if you assume a central server that guarantees a deterministic ordering of operations.

> Why wouldn’t you be able to? My server receives operations, applies them to the document, and discards them. It can receive operations as old as it wants.

OT needs to do operation transformation. If you get passed old operations, you need to transform them before they can be applied. This requires keeping extra data around. I mean, this is entirely the point of the metadata you describe in the second part of your comment:

> these can come in the form of thombstones, or operation logs, or originRight or originLeft, or DAG. Even with a garbage collector, the CRDT needs to retain some of this metadata.

> But they’re still there, even though they could be avoided entirely if you assume a central server that guarantees a deterministic ordering of operations.

A central server doesn't remove the need for this data though! Lets assume a completely deterministic ordering of operations on the server, like Jupiter. If the server is at version 1000, and I send the server an operation at version 900, the server needs to use information about operations 900-1000 to be able to apply the change. This is true in Jupiter OT - which uses the actual operations. Or in Yjs or Diamond Types or any other collaborative text editing system. We need some of that information to figure out how to transform the incoming change and order it correctly.

At least, I've never seen or heard of any scheme which doesn't need some data like this.

> A central server doesn't remove the need for this data though!

Yes, it's possible. The problem is that we're using different definitions of what OT means. This conversation has converged on the same point we started in another thread; I suggest we continue it there:

https://news.ycombinator.com/item?id=47436249