| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by haadcode 2291 days ago

> Last-Writer-Wins is a conflict resolution strategy that can be used by any kind of data type that needs conflicts resolved, CRDTs included. Unfortunately it's not a very good one: even if you use vector clocks instead of wall clocks, it doesn't give you much stronger guarantees than determinism. That is, given two concurrent writes, the winner is essentially arbitrary. LWW is a merge strategy of last resort; if that's the only thing your CRDT system offers, I'm not sure it's really fair to call it a CRDT system.

Can't reply to the comment below, so replying here.

I believe what markhenderson was trying to say is that in OrbitDB, the default merge strategy for concurrent operations is LWW.

The comment above is conflating a lot of things here. 1) determinism is exactly the guarantee one needs for CRDTs, and I'd argue generally is a good thing in distributed system but 2) adding vector clocks (OrbitDB uses Lamport clocks, or Merkle Clocks [1], by default), nor wall clocks, have nothing to do with determinism and in fact there's a good reason to not use vector clocks by default: they grow unbounded in a system where users (=IDs) are not known. In my experience, LWW is a good baseline merge strategy.

I don't think it's at all correct to say that "the winner is essentially arbitrary" because it's not. The "last" in LWW can be determined based on any number of facts. For example "in case of concurrent operations, always take the one that is written by the ID of the user's mobile device", or "in case of concurrent operations, always take the one that <your preferred time/ordering service> says should come first". It'd be more correct say "the winner is based on the logical time ordering function, which may not be chronological, real world time order".

As for the last comment, I'm pretty sure it's a CRDT system :) Want to elaborate your reasoning why you think it's not a CRDT?

[1] "Merkle-CRDTs: Merkle-DAGs meet CRDTs" - https://arxiv.org/abs/2004.00107

1 comments

sagichmal 2291 days ago

OK, I've read the paper; can you help me reason through a scenario?

As I understand it, the Merkle-CRDT represents a Merkle tree as a grow-only set of 3-tuples. When you add a new event to the thing (as a tuple) you have to reference all of the current concurrent root nodes of the data structure, in effect becoming the new single root node; and your event data, which must be a CRDT, gets merged with the CRDTs of those root nodes. Do I have it right so far?

Assuming yes, let's say you have causality chain like so:

    1 --> 2 --> 3 --> 4 
           `--> 5 --> 6

Two root nodes, 4 and 6. Two concurrent histories, 3-4 and 5-6. It's time to write a new value, so I create a new tuple with references to 4 and 6, and merge their CRDT values. Last Writer Wins, right? So either 4 or 6 dominates the other. Whoever was in the other causal history just... lost their writes?

link

haadcode 2291 days ago

almost! :) let me elaborate on few points.

> you have to reference all of the current concurrent root nodes of the data structure, in effect becoming the new single root node

correct, and more precisely the union of heads is the current "single root node". in practise, and this is where the merge strategy comes in, the "latest value" is the value of the event that is "last" (as per LWW sorting).

> and your event data, which must be a CRDT, gets merged with the CRDTs of those root nodes.

the event data itself doesn't have to be a CRDT, can be any data structure. the "root nodes" (meaning the heads of the log) don't get merged with the "event data" (assuming you mean the database/model layer on top of the log), the merge strategy of the log picks the "last/latest event data" to be the latest value of your data structure.

> It's time to write a new value, so I create a new tuple with references to 4 and 6, and merge their CRDT values.

when a new value is written, correct that the references to 4 and 6 are stored, but the new value doesn't merge the values of the previous events and rather, it's a new value of its own. it may replace the value from one or both of the previous events, but that depends on the data model (layer up from the log).

  1 --> 2 --> 3 --> 4 
         `--> 5 --> 6

> Last Writer Wins, right? So either 4 or 6 dominates the other. Whoever was in the other causal history just... lost their writes?

no writes are lost. the result in your example depends what 4 and 6 refer to. in a log database, the ordered log would be eg. 1<-2<-3<-5<-4<-6, so all values are preserved. in the case of a key-value store, it could be that 4 is a set operation to key a and 6 is a set operation to key b, thus the writes don't effect each other. if 4 and 6 are both a set operation on key a, it would mean that key a would have the value from 6 and the next write to key a would overwrite the value in a. makes sense?

link

sagichmal 2291 days ago

> in a log database, the ordered log would be eg. 1<-2<-3<-5<-4<-6

How do you know that? It's not inferrable from the DAG. Is sequencing also provided "a layer up"?

> if 4 and 6 are both a set operation on key a, it would mean that key a would have the value from 6 and the next write to key a would overwrite the value in a.

Yes, I mean for all of my events to be reads and writes of the same key. And you've proven my point, I think: if the resolution of this causal tree is Last-Writer-Wins, 6-domiantes-4, and "key a [gets] the value from 6", then whichever poor user was operating on the 3-4 causal branch has lost their writes.

This is a problem! If you claim to be a CRDT and offline-first or whatever, then as a user, I expect that the operations I make while I'm disconnected aren't just going to be destroyed when I reconnect, because someone else happened to be using a computer with a lexicographically superior hostname (or however you derive your vector clocks).

And if you want to say something like, well, when the unlucky user reconnects and sees that their work has been overwritten, they can just look in the causal history, extract their edits, and re-apply them to the new root -- there's no reason for any of this complex machinery! You don't need CRDTs to just replicate a log of all operations. Of course, you also can't do any meaningful work with such a data structure as a foundation, because it immediately becomes unusably large.

link

pas 2291 days ago

It's inferable from the fact that the writer saw both chains and produced a new node that merges these heads. So that writer "resolves" the conflict - according to whatever strategy it is programmed to use. (It might be as simple as just storing a JSON that says {"conflict": true, "values": [4, 6]} , and the user will have to pick.)

If it's possible to model operations in a commutative way (eg. instead of assigning values to keys one just stores differences), then the conflict resolution is mathematically proven, just apply all operations in whatever order, they're commutative, great. Of course it doesn't help with "real world data", but that's where mathematically we can use and oracle (the user, or whatever linearizer service we choose).

link

haadcode 2291 days ago

> How do you know that? It's not inferrable from the DAG. Is sequencing also provided "a layer up"?

I jumped the gun there and made an assumption that the value of a node is the LWW ordering :) Ok, so without that assumption, the DAG

  1 --> 2 --> 3 --> 4 
         `--> 5 --> 6

...are the values of the operations that the DAG represents, ie. values of a key, so we need to look at the Lamport clocks (or Merkle Clocks when the operations are hashed as a merkle dag) of each operation, represented here as ((ts, id), key, value):

  ((0, x), a, 1) --> ((1, x), a, 2) --> ((2, x), a, 3) --> ((3, x), a, 4)
                                   `--> ((2, y), a, 5) --> ((3, y), a, 6)

which one is the latest value for key a? Which updates, semantically, were lost? In a non-CRDT system, which value (4/x or 6/y) is or should be displayed and considered the latest?

> This is a problem! If you claim to be a CRDT and offline-first or whatever, then as a user, I expect that the operations I make while I'm disconnected aren't just going to be destroyed when I reconnect, because someone else happened to be using a computer with a lexicographically superior hostname (or however you derive your vector clocks).

You're conflating the data(base) model with the log and we can't generalize that all cases of data models or merge conflict are cases of "I expect all my operations to be the latest and visible to me". They are semantically different. If the writes are on the same key, one of them has to come first if the notion of "latest single value" is required. If the writes are not on the same key, or not key-based, multiple values appear where they need to. What we can generalize is that by giving a deterministic sorting function, the "latest value" is the same for all participants (readers) in the system. From data structure perspective this is correct: given same set of operations, you always get the same result. For many use cases, LWW works perfectly fine, and if your data model requires a "different interpretation" of the latest values, you can pass in your custom merge logic (=sorting function) in orbitdb. The cool thing is, that by giving a deterministic sorting function for a log, you can turn almost any data structure to a CRDT. How they translate to end-user data model will depend (eg. I wouldn't model, say, "comments on a blog post" as a key-value store).

If you're curious to understand more, I think the model is best described in the paper "OpSets: Sequential Specifications for Replicated Datatypes" [1]. Another two papers, from the same author, that may also help are "Online Event Processing" [2] and "Moving Elements in List CRDTs" [3] which show how by breaking down the data model to be more granular than "all or nothing", composing different CRDTs give arise to new CRDTS, which I find beautiful. Anything, really, that M. Kleppmann has written about the topic is worth a read :)

[1] https://arxiv.org/pdf/1805.04263.pdf [2] https://martin.kleppmann.com/papers/olep-cacm.pdf [3] https://martin.kleppmann.com/papers/list-move-papoc20.pdf

link

sagichmal 2291 days ago

OK, I understand now. I guess my points then translate to:

1. Modeling an append-only log as a CRDT is trivial

2. Building a database on top of a "CRDT" append-only log doesn't make the database a CRDT

link

haadcode 2291 days ago

No, on both. See above.

link