by tmikaeld 1766 days ago
That's what Replicache[0] solves: it provides Causal+ Consistency across the entire system.

"This means that transactions are guaranteed to be applied atomically, in the same order, across all clients. Further, all clients will see an order of transactions that is compatible with causal history. Basically: all clients will end up seeing the same thing, and you're not going to have any weirdly reordered or dropped messages."

[0] https://doc.replicache.dev/design

Note: There's more in their links, but the linked sites are down.

It appears Replicache doesn't use CRDTs since it has a central source of truth: https://news.ycombinator.com/item?id=22175530

See also the commentary here: https://doc.replicache.dev/guide/local-mutations

This sounds a lot like Operational Transform but without the transform part - it assumes that locally applied mutations can be undone and rebased without user interaction. But I feel like the Google Wave team would have a lot of objections to the idea that this can just be ignored. If your state is just a group of key value stores where last write wins and everyone can agree on who's last, that's fine, but text/token streams require a notion of transformation that I'm worried Replicache simply glosses over.
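To make the concern concrete, here is a minimal sketch of index-based text inserts (illustrative only; `apply_insert` and `transform_insert` are hypothetical helpers, not part of Replicache or any OT library). Replaying both operations in one agreed order gives every client the same text, but without a transform step the second insert can land in a place its author did not intend.

```python
def apply_insert(text: str, pos: int, s: str) -> str:
    """Insert s into text at index pos."""
    return text[:pos] + s + text[pos:]


def transform_insert(pos: int, other_pos: int, other_len: int,
                     other_wins_tie: bool) -> int:
    """Shift pos to account for a concurrent insert that was applied first."""
    if other_pos < pos or (other_pos == pos and other_wins_tie):
        return pos + other_len
    return pos


base = "abc"
op_a = (1, "X")  # client A wants "aXbc"
op_b = (2, "Y")  # client B wants "abYc"; the merged intent is "aXbYc"

# Replay A then B with B's original index: consistent, but Y is misplaced.
naive = apply_insert(apply_insert(base, *op_a), *op_b)

# Replay A then B with B's index transformed against A's insert.
b_pos = transform_insert(op_b[0], op_a[0], len(op_a[1]), other_wins_tie=True)
transformed = apply_insert(apply_insert(base, *op_a), b_pos, op_b[1])

print(naive)        # aXYbc
print(transformed)  # aXbYc
```

A system that re-executes mutation code on replay could in principle do this kind of adjustment inside the mutation itself, but then the application has to carry enough context (for example, a stable anchor instead of a raw index) to do so.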

Indeed, there can never be one universal solution to this, because the problem is one of specification rather than (only) implementation.

For example, suppose we have an edit/delete conflict, where two clients concurrently interact with the same entity in your data model. In a simple case, we can decide to “resurrect” the affected entity and apply the edit, which is the option that never results in significant data loss and so might be a reasonable behaviour if no user interaction is involved.
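As a rough sketch of that "resurrect" rule (the function name and the operation tuples below are made up for illustration), a resolver for concurrent operations on one entity might keep the entity whenever at least one side edited it:

```python
from typing import Optional


def resolve_concurrent(base: dict, ops: list) -> Optional[dict]:
    """Resolve concurrent ops on one entity; an edit always beats a delete.

    Each op is either ("delete",) or ("edit", {field: value, ...}).
    Returns the surviving entity, or None if it ends up deleted.
    """
    edits = [op[1] for op in ops if op[0] == "edit"]
    if not edits:
        # Only deletes (or nothing at all) happened.
        return None if ops else dict(base)
    result = dict(base)
    for changes in edits:
        result.update(changes)  # apply every edit on top of the base entity
    return result


post = {"title": "Draft", "body": "hello"}
print(resolve_concurrent(post, [("delete",), ("edit", {"title": "Final"})]))
# {'title': 'Final', 'body': 'hello'}
```

This only covers the simple case; it says nothing about uniqueness constraints or cascading deletes.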

Now, what if there were other consequences of deleting that entity? Maybe the client that deleted the entity then created a new entity that would violate some uniqueness constraint if both existed simultaneously. Or maybe it wasn’t the originally deleted entity that would violate that constraint, but some related one that was also deleted implicitly because of a cascade. How should we reconcile these changes, if simply allowing either one to take precedence means discarding data from the other?

At least if all clients are communicating in close to real time, it’s unlikely that any one of them will diverge far from the others before they get resynchronised, so the scope for awkward conflicts is limited. But in general, we might also need to support offline working for extended periods, when multiple clients might come back with longer sequences of potentially conflicting operations, and there’s no general way to resolve that without the intervention of users who can make intelligent decisions about intent, or at least a set of automated rules that makes sense in the context of that specific application. And in the latter case, we’d still probably want to prove that our chosen rules were internally consistent and covered all possible situations, which might not be easy.

> How should we reconcile these changes, if simply allowing either one to take precedence means discarding data from the other?

Exactly. This is why Replicache expresses changes as high-level, application-defined operations such as createPost or deletePerson.

Replicache doesn't try to automatically merge the effects of concurrent mutations; it just replays the mutations in the same order on each client. It's up to the implementation of the mutation to decide what the correct result is, and that answer can and often does change when the mutation is replayed on top of different states.
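A toy model of that replay idea, to show the shape of it (this is not Replicache's actual API; `MUTATORS`, `replay`, and the mutator names are invented for the example): pending local mutations are stored by name and arguments, and are re-executed on top of whatever state the server reports.

```python
import copy

MUTATORS = {}


def mutator(fn):
    """Register a named, application-defined mutation."""
    MUTATORS[fn.__name__] = fn
    return fn


@mutator
def rename_post(state, post_id, title):
    post = state["posts"].get(post_id)
    if post is not None:  # the post may have been deleted elsewhere
        post["title"] = title


@mutator
def delete_post(state, post_id):
    state["posts"].pop(post_id, None)


def replay(base_state, mutations):
    """Re-execute mutations in order on a copy of base_state."""
    state = copy.deepcopy(base_state)
    for name, args in mutations:
        MUTATORS[name](state, **args)
    return state


base = {"posts": {"p1": {"title": "Draft"}}}
pending = [("rename_post", {"post_id": "p1", "title": "Final"})]

optimistic = replay(base, pending)  # what the local client shows right away
server_state = replay(base, [("delete_post", {"post_id": "p1"})])
rebased = replay(server_state, pending)  # same mutation, different outcome

print(optimistic)  # {'posts': {'p1': {'title': 'Final'}}}
print(rebased)     # {'posts': {}}
```

The rename is a no-op once it is replayed on a state where the post is gone, which is a decision made by the mutation code, not by the sync layer.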

Because Replicache mutations are atomic, applications can also enforce invariants such as uniqueness or even more complex app-level invariants.

Imagine, for example, a calendaring application. An application built with Replicache can enforce the invariant that a room is only booked by one event in one time slice even under concurrent edits, just using normal programmatic validation. It's hard to do this kind of thing with CRDTs or other approaches to automatic merging because the data model knows nothing about the application's constraints.
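Roughly, the room-booking check could look like the sketch below (hypothetical names throughout; hours are plain integers to keep it short). Because every replica replays the same log in the same order, each one accepts the first booking and rejects the second.

```python
import copy


def book_room(state, event_id, room, start, end):
    """Book a room unless an existing booking overlaps [start, end)."""
    for booking in state["bookings"].values():
        overlaps = start < booking["end"] and booking["start"] < end
        if booking["room"] == room and overlaps:
            return False  # invariant would be violated; reject
    state["bookings"][event_id] = {"room": room, "start": start, "end": end}
    return True


def replay(base, log):
    """Apply every booking request in log order to a copy of base."""
    state = copy.deepcopy(base)
    results = [book_room(state, **args) for args in log]
    return state, results


empty = {"bookings": {}}
log = [  # two clients concurrently asked for room A at 9:00
    {"event_id": "standup", "room": "A", "start": 9, "end": 10},
    {"event_id": "review", "room": "A", "start": 9, "end": 11},
]

state, results = replay(empty, log)
print(results)                  # [True, False]
print(sorted(state["bookings"]))  # ['standup']
```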

It's a pretty simple-minded system, actually, but our experience is that it is a nice way to think about these problems and provides good results for many types of data, in particular structured data.

The good old CAP theorem hits again...

I’m not sure you realise that when Replicache rebases operations locally, it actually re-executes code, which can have arbitrary effects. This design yields a lot of flexibility to preserve intent: the function can look at the current state of the world and decide to do something different.

Now, it is true that OT is considered the gold standard for certain kinds of collaborative editing, in particular unstructured text. But CRDTs are quickly catching up and I believe that any CRDT should by definition be implementable on top of Replicache.

It's also quite a lot easier to implement a Replicache backend than an OT backend.

I don’t know enough to comment on Replicache, but you can also do OT on top of an operation-based CRDT. For diamond types we’re making it support both, so if you want to, applications can do OT (which is simple, small, and fast) to talk to a server (or local proxy process), and then that process can do p2p server-to-server replication using CRDTs.

The result is we need way less complexity in the browser, or in applications. And still get all the advantages CRDTs bring, namely no need for a central server acting as the source of truth.

Cool, I need to look into this more.

I think for many customers the authoritative server is an advantage. It's useful in SaaS apps for the server to be able to override the clients, for all kinds of reasons -- antiabuse, authorization, extra validation rules, or just fixing bugs.

Yes, I completely agree. And I think we want both:

- A fast and well written CRDT that works in p2p networks should also work great for server-to-server replication in a data center (or across data centers).

- OT algorithms designed to work with centralized servers are simple, efficient, easy to code up and easy to work with. And they provide a really nice API for local applications to do IPC. CRDT libraries can expose OT endpoints just fine.

I'm still not 100% sure about what the best approach is in the P2P case. Embedding (/ linking) a CRDT library into every application would also work fine, but it's complicated to get everything working across languages. And harder to update. The other option is running a single system / application-wide CRDT-like service which manages credentials, that applications talk to like LSP / D-Bus. In that case, applications can just talk OT (which is much simpler).

Either approach would work.

I'd rather it was configurable, since there are different use cases for both and they can exist in the same app. So you're definitely making a valid point.

How one wants to see them could depend; that's why I recommend using an RDBMS. One can "play back" transactions using different orders and filters. If teams get confused or accidentally "step on each other's toes", then one may need to review different scenarios to see what was intended by two or more parties.