Hacker News new | ask | show | jobs
by hem777 1167 days ago
We don’t assume “reliable delivery” in AP or eventually consistent systems. We assume “once all messages have been delivered…”

So if you have all messages and the two properties above, a total order can be derived.

You’re correct to say that causal order != total order as such but with the use of correct primitives, like Lamport Causal Clocks, we can get a nice and clean linear order of events :)

1 comments

"correct primitives" do not by themselves provide a linear order of events

      a
     / \
    b   c
     \ /
      d
b and c are concurrent updates, how do you resolve d?

it's a trick question, you can't resolve d without information loss, unless you bring in additional knowledge from the application

you can resolve d with information loss by defining some heuristic for ordering concurrent updates, a.k.a. last-writer-wins, basically picking one of those updates deterministically

that gets you a total order, but it's cheating: whichever concurrent updates you don't choose are lost, and that violates consistency for any replica(s) that are based on that state

there is no free lunch

> "correct primitives" do not by themselves provide a linear order of events

Review the description of Lamport Causal Clock in the article. Note that it carries “additional info” (additional to the example diagram). This “additional info” is what establishes the structure needed for total order.

> whichever concurrent updates you don't choose are lost, and that violates consistency

They’re not lost! The concurrent updates not chosen are still part of the “list of operations”, but the update to the data/value itself may not be observable if the subsequent update (the update that was sorted to be first) updates the same data/value (eg. both operations update the same key in a key-value structure). If the two operations update different data/value, then both updated values are observable. This isn’t cheating, rather it works exactly as expected: it is eventually consistent.

we are only talking about updates to a specific value here, obviously updates to independent values are trivial to resolve

it's possible to construct a CRDT such that concurrent updates are merged without data loss to a single "list of operations" maintained in the object, but that's not true in general

resolving conflicts with the lww strategy, or variants of that strategy that order concurrent events by e.g. node ID, are indeed eventually consistent at the highest level, but they provide no meaningful consistency guarantees to users, because they allows "committed" writes to be lost

> that's not true in general

Can you elaborate what do you mean by this? I was arguing that it’s possible as the original argument was “this is not possible in a real system and is only theoretical”.

> provide no meaningful consistency guarantees to users, because they allows "committed" writes to be lost

If I set the (shared) value to green and you set it to blue, what is the expected observed value? What if you set it to green and I set it blue, what is the observed value? More importantly, what is the consistency that was lost?

so there are multiple threads of conversation here

first, there is no straightforward way to module "a value" as a CRDT, precisely for the reason you point out: how to resolve conflicts between concurrent updates is not well-defined

concretely, if you set v=green and I set v=blue, then the expected observed value is undefined, without additional information

there are various ways to model values as CRDTs, each has different properties, the simplest way to model a value as a CRDT is with the LWW conflict resolution strategy, but this approach loses information

example: say your v=green is the last writer and wins, then I will observe (in my local replicas) that v=blue for a period of time until replication resolves the conflict, and sets (in my local replicas) v=green. when v changes from blue to green, the whole idea that v ever was blue is lost. after replication resolves the conflict, v was never blue. it was only ever green. but there was a period of time where i observed v as blue, right? that observation was invalid. that's a problem. consistency was lost there.

--

second

> almost any data structure can be turned into a (op-based) CRDT.

yes, in theory. but op-based CRDTs only work if message delivery between nodes is reliable. and no real-world network provides this property (while maintaining availability).

> there was a period of time where i observed v as blue, right? that observation was invalid.

Not invalid. The observation was correct at that time and place, meaning, in the partition that it was observed. This is the P in AP.

It seems to me that you’re talking about and arguing for synchronization, that is, consensus about the values (=state), which takes the system to CP as opposed to AP.

> there is no straightforward way to module "a value"

I would recommend to look into the “Register” (CR)data structure.