|
It's the Byzantine Fault Tolerant part of this that is particularly innovative and based on Kleppmanns most recent work. I believe it solves the issue of either malicious actors in the networks modifying others transactions, spoofing them, or the messages being modified by third parties ("outside" the network) who have MITM the connection. These are really great problems to solve. However, when I was experimenting with CRDTs a while back, it seemed to me the other big issue is where multiple transactions from different users combine to create an invalid state. Most CRDT toolkits are aiming for a combination of a JSON like type structure, with the addition on specific structures for rich text. The JSON structures for general purpose data, and those for rich text as that is the most common current use case. These sort of general purpose CRDTs don't have a way to handle, say, the maximum length of an array/string, or minimum and maximum values for a number. They leave this to the application using the toolkit. For the Yjs + ProseMirror system, effectively the CRDTs are resolved outside of ProseMirror. Thats useful as they can be combined/resolved on the server without a ProseMirror instance running. However there is a strong possibility that the resulting structure is no longer valid for your ProseMirror Schema, this is currently handled by ProseMirror throwing away the invalid parts of the document when it loads. What I think is needed is a Schema system that is a layer either on top of these toolkits, or as part of them, that provides rules and conflict resolution. So there is a way to specify the maximum length of an array/string, or what the maximum value of a number is. Effectively generic CRDTs that have an understanding of developer supplied rules and bounds. The "maximum length" is an interesting one, as depending on the order of transactions you could end up with a different result. |
Havent had documents corrupted by Yjs allowing changes that are not parseable by schema though - has this happened to you?
And about the schema layer on top of Yjs, you possibly could inspect every update and apply some validation rules. Arent all operations just inserts, updates or deletes of nodes? You can at least rollback to previous version as you flush the updates to the doc in the db. Not ideal though.