Hacker News new | ask | show | jobs
by strls 1398 days ago
A CRDT is a way to solve multi-leader replication without having the application code resolve conflicts.

This is what is required to build an app where any instance (node) can be offline for an arbitrary amount of time, but still be able to share state with the rest of the nodes when it's reconnected.

To implement this, every application node keeps a vector clock per register (an atomic piece of shared state). The vector clock allows any node the compare its own version of the register with the state received from any other node. Two values of a vector clock can either be causally related (in which case the most recent write wins) or concurrent. However the concurrency is from the system's perspective, but not necessarily from the user's perspective. An extra physical timestamp can be kept at the register level to order concurrent updates in a way consistent with the user's time perception.

Now, having the hybrid clocks in place to version each register on each node, the system must implement a protocol to ship every register update to all nodes (reliable broadcast).

Once all updates are shipped to all nodes, it's guaranteed that all nodes have the same (most recent) state.

(I built an offline-first product and had to roll my own protocol)

1 comments

Nice writeup.

I've read the CRDT paper but never implemented it. Question - if you're not using LWW (instead you have concurrent values of a vector clock), this is where you have your CRDT and merge the states coming from every node?