Hacker News
by halfcat 873 days ago
Conflicts themselves are not hard: keep a directed acyclic graph of immutable records, where each change to a record points to its parent (prior) record. When two users update the same record, you now have a tree.
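To make the structure concrete, here is a minimal in-memory sketch in Python. The `Version` shape, the field names, and the `heads` helper are all hypothetical, chosen only for illustration; the point is that a record with more than one head version has diverged.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One immutable version of a record (hypothetical shape)."""
    id: str
    record_id: str
    parent_id: Optional[str]  # None for the first version of a record
    payload: str


def heads(versions: list[Version], record_id: str) -> list[Version]:
    """Return the versions of a record that no other version points to."""
    mine = [v for v in versions if v.record_id == record_id]
    parents = {v.parent_id for v in mine if v.parent_id is not None}
    return [v for v in mine if v.id not in parents]


# Two users start from the same version "v1" and edit independently.
history = [
    Version("v1", "note-1", None, "Buy milk"),
    Version("v2", "note-1", "v1", "Buy oat milk"),       # user A
    Version("v3", "note-1", "v1", "Buy milk and eggs"),  # user B
]

conflicting = heads(history, "note-1")
has_conflict = len(conflicting) > 1  # more than one head means the record forked
```

Nothing is overwritten here: both edits stay in the history, and the fork is visible as two heads.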

The challenge is interpreting what that tree structure should mean.

If you can, let a user decide how to resolve the conflict.

- User logs in, they have a “conflict inbox” of things that need to be resolved.

- Two coworkers make conflicting edits, maybe the manager gets a notification in their conflict inbox and decides which edit wins.
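A rough sketch of what such an inbox could look like, again in Python. Everything here is an assumption made for illustration: the `Version` fields, the routing rule in `inbox_owner` (your own conflicts go to you, conflicts between coworkers go to a manager), and the idea of resolving by appending a merge version that points at every competing head.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Immutable version of a record; a merge version has several parents."""
    id: str
    record_id: str
    parent_ids: tuple[str, ...]
    payload: str
    author: str


def open_conflicts(versions: list[Version]) -> dict[str, list[Version]]:
    """Map record id -> competing heads; records with one head are omitted."""
    referenced = {p for v in versions for p in v.parent_ids}
    heads: dict[str, list[Version]] = {}
    for v in versions:
        if v.id not in referenced:
            heads.setdefault(v.record_id, []).append(v)
    return {rid: hs for rid, hs in heads.items() if len(hs) > 1}


def inbox_owner(heads: list[Version], manager: str) -> str:
    """Assumed routing rule: same author -> that author, otherwise the manager."""
    authors = {h.author for h in heads}
    return authors.pop() if len(authors) == 1 else manager


def resolve(versions, record_id, new_id, payload, author):
    """Append a merge version that points at every competing head."""
    heads = open_conflicts(versions)[record_id]
    merged = Version(new_id, record_id, tuple(h.id for h in heads), payload, author)
    return versions + [merged]


history = [
    Version("v1", "task-7", (), "Due Friday", "alice"),
    Version("v2", "task-7", ("v1",), "Due Monday", "alice"),
    Version("v3", "task-7", ("v1",), "Due Thursday", "bob"),
    Version("w1", "task-8", (), "Order parts", "bob"),
]

pending = open_conflicts(history)                        # only task-7 is listed
owner = inbox_owner(pending["task-7"], manager="carol")  # alice vs bob -> carol
history = resolve(history, "task-7", "v4", "Due Monday", author="carol")
remaining = open_conflicts(history)                      # empty after the merge
```

The losing edit is not deleted; it just stops being a head once the merge version references it.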

3 comments

Conceptually that's not hard, but in practice an approach like that can significantly increase the complexity of the app:

1. Do you store the tree structure for every table in your app? If you have 20 tables that could be edited offline, do you re-implement it for each table, or try to have a generic implementation?

2. Do you design all your tables around the tree structure, or do you just store it in addition to your "normal" tables?

3. Every piece of code that modifies one of these tables needs to do it via the tree structure - if you update your tables directly from any place, it could effectively cause conflicts.

4. Do you build separate UI to resolve conflicts for every table?

5. Do you query and cache the tree structure on the device, or does it have to be online to resolve conflicts?

6. Do you expose the tree structure via external APIs, or keep it internal?

I find that "last-write-wins" is sufficient for a large percentage of cases, and much simpler to implement. Or in some cases, just doing conflict detection is sufficient (notify the user that the data has changed between loading and saving, and they need to re-apply their edits).
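For comparison, both of those simpler options fit in a few lines. This is only a sketch in Python with hypothetical names (`Row`, `save_last_write_wins`, `save_with_detection`); it assumes each row carries a timestamp and a version counter, and that timestamps from different devices are comparable, which clock skew can undermine in practice.

```python
from dataclasses import dataclass


@dataclass
class Row:
    value: str
    version: int = 0         # incremented on every successful save
    updated_at: float = 0.0  # timestamp of the write that produced `value`


class StaleWriteError(Exception):
    """The row changed between loading and saving."""


def save_last_write_wins(row: Row, value: str, timestamp: float) -> None:
    """Last-write-wins: keep whichever write carries the newest timestamp."""
    if timestamp >= row.updated_at:
        row.value = value
        row.updated_at = timestamp
        row.version += 1


def save_with_detection(row: Row, value: str, loaded_version: int, timestamp: float) -> None:
    """Conflict detection only: refuse the save if someone else saved first."""
    if loaded_version != row.version:
        raise StaleWriteError("data changed since it was loaded; re-apply your edits")
    row.value = value
    row.updated_at = timestamp
    row.version += 1


# Last-write-wins: an older write that arrives late is silently dropped.
row = Row("draft")
save_last_write_wins(row, "edit from A", timestamp=10.0)
save_last_write_wins(row, "edit from B", timestamp=12.0)
save_last_write_wins(row, "late arrival from C", timestamp=11.0)
lww_result = row.value

# Detection: A and B both loaded version 0; the second save is rejected.
doc = Row("draft")
loaded = doc.version
save_with_detection(doc, "edit from A", loaded_version=loaded, timestamp=10.0)
try:
    save_with_detection(doc, "edit from B", loaded_version=loaded, timestamp=12.0)
    rejected = False
except StaleWriteError:
    rejected = True
```

The trade-off is visible in the two halves: last-write-wins never bothers the user but discards C's edit, while detection keeps everything but pushes the work back to the second writer.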

If you do need conflict resolution on a large scale (many different tables), I'd recommend using data structures designed for that. CRDTs are one example - while they are typically used for automatic conflict resolution, they often store enough data to allow manual resolution if desired.
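As one illustration of a CRDT that keeps enough data for manual resolution, here is a small multi-value register in Python. It is a sketch, not a production CRDT: the `MVRegister` class and its methods are hypothetical names, and it assumes each replica has a unique id. Each value is tagged with a version vector; merging drops values that another value has strictly superseded and keeps concurrent ones side by side.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def dominates(a: dict, b: dict) -> bool:
    """True if version vector `a` has seen everything recorded in `b`."""
    return all(a.get(replica, 0) >= count for replica, count in b.items())


@dataclass
class MVRegister:
    replica: str
    # Each entry is (value, version vector at the time of the write).
    entries: list = field(default_factory=list)

    def write(self, value: str) -> None:
        """Overwrite every value this replica currently knows about."""
        clock: dict = {}
        for _, seen in self.entries:
            for replica, count in seen.items():
                clock[replica] = max(clock.get(replica, 0), count)
        clock[self.replica] = clock.get(self.replica, 0) + 1
        self.entries = [(value, clock)]

    def merge(self, other: MVRegister) -> None:
        """Keep every entry that no other entry strictly supersedes."""
        combined = self.entries + [e for e in other.entries if e not in self.entries]
        self.entries = [
            (value, clock)
            for value, clock in combined
            if not any(dominates(c2, clock) and c2 != clock for _, c2 in combined)
        ]

    def values(self) -> list:
        return sorted(value for value, _ in self.entries)


a, b = MVRegister("a"), MVRegister("b")
a.write("Buy milk")
b.merge(a)                         # both replicas agree
a.write("Buy oat milk")            # offline edit on a
b.write("Buy milk and eggs")       # concurrent offline edit on b
a.merge(b)
concurrent = a.values()            # both values survive the merge
a.write("Buy oat milk and eggs")   # a user resolves the conflict by hand
b.merge(a)
resolved = b.values()
```

An app could also pick one of the concurrent values automatically; the register only guarantees that the choice is still available after syncing.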

I wrote a Swift library to experiment with this architecture using SQLite.

https://github.com/gerdemb/SQLiteChangesetSync

The library works at the database level and stores database modifications as binary change sets in a separate table that models a graph similar to a git repository. Capturing modifications is as simple as wrapping the transaction with a handler provided by the library. The graph of database modifications is stored locally on each device, but can easily be synced with an online repository.

The library detects conflicts, and provides a handler to the application for conflict resolution.
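To give a feel for the general idea (this is not the library's API or schema, just a Python sketch with made-up table and function names), a git-like graph of change sets can live in one table, and a recursive query over the parent links tells you whether two heads have diverged. The `data` column holds placeholder bytes here; real change sets would come from SQLite's session extension, which Python's `sqlite3` module does not expose.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE changeset (
        id        TEXT PRIMARY KEY,
        parent_id TEXT REFERENCES changeset(id),  -- NULL for the root
        data      BLOB NOT NULL                   -- opaque binary change set
    );
""")
db.executemany(
    "INSERT INTO changeset VALUES (?, ?, ?)",
    [
        ("c1", None, b"placeholder"),
        ("c2", "c1", b"placeholder"),  # committed on device A
        ("c3", "c1", b"placeholder"),  # committed on device B
    ],
)


def ancestors(head: str) -> set:
    """Ids reachable from `head` by following parent links (head included)."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(id, parent_id) AS (
            SELECT id, parent_id FROM changeset WHERE id = ?
            UNION ALL
            SELECT c.id, c.parent_id
            FROM changeset c JOIN chain ON c.id = chain.parent_id
        )
        SELECT id FROM chain
        """,
        (head,),
    )
    return {row[0] for row in rows}


def sync(local_head: str, remote_head: str, on_conflict):
    """Fast-forward when possible; otherwise hand both heads to the app."""
    if local_head in ancestors(remote_head):
        return remote_head  # remote is ahead: fast-forward
    if remote_head in ancestors(local_head):
        return local_head   # remote is behind: nothing to do
    return on_conflict(local_head, remote_head)


def needs_merge(local, remote):
    # Hypothetical handler: a real app would merge or ask the user here.
    return ("conflict", local, remote)


fast_forward = sync("c1", "c2", needs_merge)
diverged = sync("c2", "c3", needs_merge)
```

A fuller version would also allow merge commits with two parents; the single `parent_id` column is a simplification.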

If we want to discuss terminology, I agree I should have said "handling conflicts" instead of just "conflicts".

What you are describing is just the beginning of conflict handling. The consequences of bad handling are dire: data loss. If handling and resolving conflicts were easy (hint: it's not), the article would have mentioned it.

Thank you. Do you see the difficulty in implementing the data structure to handle this? Or in the decision about what to do about it? Or about how to automate resolution?

Let’s take git as an example. If we each push changes to our own branch, there’s no problem. We have a git repo with different branches. For a single record, this is even simpler, just an append-only table with a foreign key to the prior state.
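For the single-record case, that table might look something like the following sketch, using Python's built-in `sqlite3` module. The table and column names are hypothetical, and the triggers that reject `UPDATE` and `DELETE` are just one possible way to keep the table append-only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.executescript("""
    CREATE TABLE note_version (
        id       INTEGER PRIMARY KEY,
        note_id  TEXT NOT NULL,
        prior_id INTEGER REFERENCES note_version(id),  -- NULL for the first version
        body     TEXT NOT NULL
    );
    -- Append-only: reject any attempt to rewrite or delete history.
    CREATE TRIGGER note_version_no_update BEFORE UPDATE ON note_version
    BEGIN SELECT RAISE(ABORT, 'note_version is append-only'); END;
    CREATE TRIGGER note_version_no_delete BEFORE DELETE ON note_version
    BEGIN SELECT RAISE(ABORT, 'note_version is append-only'); END;
""")

db.executemany(
    "INSERT INTO note_version (id, note_id, prior_id, body) VALUES (?, ?, ?, ?)",
    [
        (1, "n1", None, "Buy milk"),
        (2, "n1", 1, "Buy oat milk"),       # my edit, based on version 1
        (3, "n1", 1, "Buy milk and eggs"),  # your edit, also based on version 1
    ],
)

# "Branch tips": versions that no later version names as its prior state.
tips = db.execute(
    """
    SELECT v.id, v.body
    FROM note_version v
    WHERE v.note_id = ?
      AND NOT EXISTS (SELECT 1 FROM note_version c WHERE c.prior_id = v.id)
    ORDER BY v.id
    """,
    ("n1",),
).fetchall()

try:
    db.execute("UPDATE note_version SET body = 'overwritten' WHERE id = 1")
    update_blocked = False
except sqlite3.DatabaseError:
    update_blocked = True
```

Both edits are stored as rows 2 and 3; neither push fails, and the divergence only shows up when you ask for the tips.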

If someone reviews a PR and finds a merge conflict, it gets handled. Maybe one wins, both get rejected, or both get accepted (a fork). But there’s no requirement that data be discarded.

But automating it for all circumstances seems impossible, since the right outcome depends on human intent.

I created an experimental Swift package that implemented this architecture using SQLite.

https://github.com/gerdemb/SQLiteChangesetSync

As you noted, detecting conflicts is easy, but handling them is not.