Hacker News new | ask | show | jobs
by CGamesPlay 1432 days ago
So what is needed here is some sort of data type that is replicated, and guaranteed to converge. A convergent, replicated data type… a CRDT! In this case it rather seems like you wanted a regular-old time stamp, not a lamport timestamp, which gives you a “last-write-wins register” CRDT.

But perhaps you wanted something that actually handles the conflicts, like for a list of subscribers to an event. In that case you would want to use a vector clock instead of a lamport timestamp, and then when neither clock dominates the other (aka when they are tied), you take both sets and merge them. This is an incomplete outline of the “add-wins set” CRDT.

I wish CRDTs were more mainstream :)

2 comments

While I agree that CRDTs could work here, what struck me is you comment about a “regular-old timestamp”. In distributed systems, this isn’t simple. In fact, Lamport Clocks are a specific solution to guaranteeing a consistent monotonically increasing clock such that one time is known to occur after the other. A single instance, or primary db, can provide this, but in distributed systems this is the primary problem. There is no “regular-old timestamp” in distributed systems.
Right. You can't guarantee clocks in a distributed system are synchronised, unless you control all the nodes. Lamport clocks don't use time, they work on causal order. This came after that, not this happened at a particular time.
Definitely true. In this case, where nodes are browsers (so clocks can easily be synced to within a few seconds of one another) and changes are user settings (so they change less than once an hour), a last-write-wins register seems like a great choice for an atomic setting like a boolean.

For settings where we can more intelligently handle conflicts (e.g. sets), we don't need the timestamps, because we can take advantage of vector clocks instead.

The "back end wins on un-ordered messages" scheme is a kind of vector clock, I suspect.