Hacker News
by jtanderson 642 days ago
Thanks for the reply! Will definitely be looking into Ditto as a migration option. What would you say the main benefits are of going schema-less? In this context does it more mean that the client is responsible for serializing/de-serializing into the correct data structures? I'm browsing the docs already but any extra information would be helpful, particularly around handling evolving data needs, import/export, client sync, etc.

Yeah, the client has to serialize/deserialize. As a mental model, think of the DB as a mini Mongo where you insert JSON into collections. The reason is that, in our experience, sync is easier when the system is schema-less: sync ties people together across teams, so the schema tends to live above each individual client (iOS, Android, backend, etc., and in larger orgs those are each separate teams). Furthermore, advances such as Codable in Swift and similar patterns mean client devs are fairly used to handling this with JSON/REST anyway.
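To make the "client owns the schema" idea concrete, here is a minimal sketch in Python. It does not use Ditto's API; the `Task` type, its fields, and the later-added `priority` field are all hypothetical. It only shows the client decoding schema-less JSON defensively so that documents written by older or newer app versions still load.

```python
import json
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical client-side model; the database only ever sees JSON."""
    title: str
    done: bool = False
    priority: int = 0  # assumed to be added in a later app version

    def to_json(self) -> str:
        return json.dumps(
            {"title": self.title, "done": self.done, "priority": self.priority}
        )

    @classmethod
    def from_json(cls, raw: str) -> "Task":
        doc = json.loads(raw)
        # Missing fields fall back to defaults; unknown fields are ignored.
        return cls(
            title=doc.get("title", ""),
            done=doc.get("done", False),
            priority=doc.get("priority", 0),
        )


# A document written by an older client that predates `priority`.
old_doc = '{"title": "ship v1", "done": true}'
task = Task.from_json(old_doc)

# A document written by a newer client with a field this client does not know.
new_doc = '{"title": "ship v2", "done": false, "priority": 2, "assignee": "sam"}'
newer_task = Task.from_json(new_doc)
```

The trade-off is that nothing stops a client from writing a malformed document, so each team has to agree on defaults and on how unknown fields are treated.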

Where Ditto is more similar to Realm is its query-based sync approach. This is an area I am particularly passionate about and helped push at Realm. The challenge with this vs. strict channels is the scalability of the backend system, but we have solved these scalability challenges in several ways.
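As a rough illustration of query-based sync (a toy model, not how Ditto or Realm implement it), the sketch below treats each subscription as a predicate and syncs only the documents that match. The `docs_to_sync` helper and the document fields are made up for the example.

```python
from typing import Callable

Doc = dict
Query = Callable[[Doc], bool]


def docs_to_sync(store: list[Doc], subscriptions: list[Query]) -> list[Doc]:
    """Return the documents matching at least one of a client's subscribed queries."""
    return [d for d in store if any(q(d) for q in subscriptions)]


store = [
    {"_id": "1", "type": "order", "store_id": 7, "status": "open"},
    {"_id": "2", "type": "order", "store_id": 7, "status": "closed"},
    {"_id": "3", "type": "order", "store_id": 9, "status": "open"},
]


def open_orders_store_7(d: Doc) -> bool:
    # Hypothetical subscription: open orders for store 7 only.
    return d["store_id"] == 7 and d["status"] == "open"


synced = docs_to_sync(store, [open_orders_store_7])
```

Even in this toy form the backend cost is visible: every change has to be checked against every client's queries, whereas with fixed channels a document is routed by a single tag.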

Most prominently, internally Ditto works differently from at least my understanding of how Realm Sync worked originally (it might have evolved more under Mongo). We use CRDTs to handle conflict resolution, so any JSON inserted into Ditto ends up as a nested CRDT structure on disk. This enables the same predictable conflict resolution, but it's a crucial trade-off compared to other approaches.
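For intuition only, here is a minimal sketch of JSON turned into a nested CRDT. It uses last-writer-wins registers inside nested maps, which is one simple CRDT design and is assumed here purely for illustration; it is not Ditto's actual on-disk format. The stamp layout `(counter, peer_id)` and all function names are hypothetical.

```python
def to_crdt(value, stamp):
    """Wrap plain JSON: dicts become nested maps, other values become (stamp, value) registers."""
    if isinstance(value, dict):
        return {k: to_crdt(v, stamp) for k, v in value.items()}
    return (stamp, value)


def merge(a, b):
    """Merge two replicas; the result is the same whichever side merges first."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = {}
        for k in a.keys() | b.keys():
            if k in a and k in b:
                merged[k] = merge(a[k], b[k])
            else:
                merged[k] = a[k] if k in a else b[k]
        return merged
    if isinstance(a, dict) or isinstance(b, dict):
        # Simplifying assumption: a field is a map or a register on both replicas.
        raise TypeError("field kind differs between replicas")
    # Both are registers: the higher (counter, peer_id) stamp wins.
    return max(a, b, key=lambda register: register[0])


def to_plain(node):
    """Strip the CRDT metadata to get ordinary JSON back."""
    if isinstance(node, dict):
        return {k: to_plain(v) for k, v in node.items()}
    return node[1]


base = {"name": "Ann", "address": {"city": "Oslo", "zip": "0150"}}
replica_a = to_crdt(base, (1, "a"))
replica_b = to_crdt(base, (1, "a"))  # both replicas start from the same state

replica_a["address"]["city"] = ((2, "a"), "Bergen")  # edit made on device A
replica_b["name"] = ((2, "b"), "Anna")               # concurrent edit on device B

merged_ab = merge(replica_a, replica_b)
merged_ba = merge(replica_b, replica_a)
```

Both merge orders converge on the same document, with A's city change and B's name change both kept. The per-field stamps are the extra metadata that makes this possible.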

First, CRDTs enable P2P sync where no server is needed to mediate the system. This opens up use-cases that Ditto powers where mobile devices sync over Bluetooth and P2P WiFi even without a device accessing the internet. Furthermore, it also opens up more scalability on the backend because the server nodes themselves can communicate P2P.

At a more fundamental level, this is a trade-off: CRDTs use extra metadata (more context, like version vectors, to know how to merge with each other), whereas Realm's original approach, Operational Transformation, is algorithmic. We felt the metadata approach was holistically better because it opens up new use-cases and scalability, and there are clever ways to keep the metadata cost from being a real-world issue (e.g. compression in various ways). Conversely, the algorithmic approach is limited to a client-server architecture and has computational scalability issues.
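To show what that metadata buys, here is a small generic version-vector sketch (textbook form, not Ditto's implementation; the peer names are invented). Each replica tracks how many updates it has seen from every peer, which is enough to tell whether one state supersedes another or whether the two are concurrent and need a merge, with no server in between.

```python
def compare(a: dict, b: dict) -> str:
    """Classify version vector a relative to b."""
    peers = a.keys() | b.keys()
    a_ahead = any(a.get(p, 0) > b.get(p, 0) for p in peers)
    b_ahead = any(b.get(p, 0) > a.get(p, 0) for p in peers)
    if a_ahead and b_ahead:
        return "concurrent"  # neither has seen all of the other's updates
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge_vectors(a: dict, b: dict) -> dict:
    """After merging state, take the per-peer maximum of both vectors."""
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in a.keys() | b.keys()}


phone = {"phone": 3, "tablet": 1}
tablet = {"phone": 2, "tablet": 2}

relation = compare(phone, tablet)
merged = merge_vectors(phone, tablet)
```

The vector grows with the number of peers that have written, which is the metadata cost mentioned above.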

All in all, we are very sympathetic to transition costs, but are confident folks will find Ditto can match the capabilities of Device Sync and then some. Our current challenge is going to be introducing a pricing scheme to support the broader set of Mongo users. To date, we have been very focused on larger enterprise deployments (gotta ensure the bills are paid!), so we will be creative to ensure that anyone who likes us is not held back by pricing when transitioning!

Got it, this is very interesting and informative. I look forward to learning more about Ditto as I experiment with migrating away from Realm.