Hacker News
by holtwick 729 days ago
Thanks for the detailed feedback.

The growth of the log is indeed a weak point, one that could be mitigated by regularly merging entries. Missing entries are easy to detect because a consecutive index is used. The checksums on the previous entry should improve data consistency.
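To make the index and checksum idea concrete, here is a minimal sketch. It assumes each entry stores a checksum of the entry before it; the `LogEntry` shape, the field names, and the FNV-1a-style checksum are all hypothetical stand-ins, not the actual format of the protocol.

```typescript
// Hypothetical entry shape; field names are assumptions, not the protocol's format.
interface LogEntry {
  index: number;        // consecutive, starting at 0
  prevChecksum: string; // checksum of the previous entry ("" for the first entry)
  payload: string;      // e.g. an encoded CRDT update
}

// FNV-1a-style 32-bit checksum as a stand-in; a real log would likely use a cryptographic hash.
function checksum(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

function entryChecksum(entry: LogEntry): string {
  return checksum(`${entry.index}|${entry.prevChecksum}|${entry.payload}`);
}

// Appends a payload, linking the new entry to the checksum of the last one.
function append(log: LogEntry[], payload: string): LogEntry[] {
  const prev = log[log.length - 1];
  const entry: LogEntry = {
    index: log.length,
    prevChecksum: prev ? entryChecksum(prev) : "",
    payload,
  };
  return [...log, entry];
}

// Returns the position of the first missing or inconsistent entry, or -1 if the log is intact.
function firstProblem(log: LogEntry[]): number {
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    if (entry.index !== i) return i; // gap in the consecutive index
    const expected = i === 0 ? "" : entryChecksum(log[i - 1]);
    if (entry.prevChecksum !== expected) return i; // previous entry was altered
  }
  return -1;
}
```

Because each checksum covers the previous entry's own `prevChecksum`, changing an old entry breaks the link in the entry that follows it, so a reader can tell where the log stops being trustworthy.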

The point that CRDTs themselves already contain all the information required for an update is absolutely correct. I have been working on this protocol for some time, and one objective was the reproducibility of the individual changes for accountability reasons. But this may not be necessary for all applications and could possibly be achieved in other ways. Thank you for pointing this out, I will reconsider the concept in this respect!


I would like to refine my answer regarding the rapidly growing log. If we assume that we have a real-time application, then every keystroke or pointer action can indeed create a change entry.

But this storage format is designed for "long term" and "slow" operations, where "slow" means on a time scale of seconds rather than milliseconds. This allows us to combine multiple changes into a single log entry.
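As a rough illustration of combining changes, the sketch below groups changes that arrive within one second of the first change in a batch into a single entry. The `Change` and `BatchedEntry` types, the `batchChanges` name, and the one-second default are assumptions made for this example, and the input is assumed to be sorted by time.

```typescript
// Hypothetical types for illustration only.
interface Change {
  at: number;   // timestamp in milliseconds
  data: string; // e.g. an encoded CRDT update
}

interface BatchedEntry {
  from: number;      // timestamp of the first change in the batch
  to: number;        // timestamp of the last change in the batch
  changes: string[]; // all changes combined into this log entry
}

// Groups time-sorted changes so that each log entry covers at most `windowMs` milliseconds.
function batchChanges(changes: Change[], windowMs = 1000): BatchedEntry[] {
  const entries: BatchedEntry[] = [];
  for (const change of changes) {
    const last = entries[entries.length - 1];
    if (last && change.at - last.from < windowMs) {
      // Still inside the current window: extend the existing entry.
      last.to = change.at;
      last.changes.push(change.data);
    } else {
      // Window elapsed (or first change): start a new entry.
      entries.push({ from: change.at, to: change.at, changes: [change.data] });
    }
  }
  return entries;
}
```

With a CRDT, the collected changes in one batch could then be merged into a single update before the entry is written, which is where most of the size reduction would come from.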

CRDT implementations like Yjs are good at concentrating such changes into smaller chunks of data. For example, writing text in a rich text editor like ProseMirror is then reduced to something like a string and a position.

But the UI can also be lazy and throttle things. A string input field can fire a change only when the field loses focus or when nothing has been typed for a second or so.
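A lazy input field along these lines could look roughly like the sketch below. The names `lazyField`, `Scheduler`, and `realTimers` are made up for this example, and the one-second idle time is just the figure mentioned above. The scheduler is injectable only so the behavior can be exercised without real timers.

```typescript
// Minimal timer abstraction (hypothetical) so the sketch works with setTimeout or a fake clock.
interface Scheduler {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Scheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Reports a change only after `idleMs` without input, or immediately when the field is left.
function lazyField(
  onChange: (value: string) => void,
  idleMs = 1000,
  timers: Scheduler = realTimers,
) {
  let pending: string | undefined;
  let handle: unknown;

  const flush = () => {
    timers.clear(handle);
    handle = undefined;
    if (pending !== undefined) {
      const value = pending;
      pending = undefined;
      onChange(value); // one change entry for the whole burst of typing
    }
  };

  return {
    // Call on every keystroke; restarts the idle timer.
    input(value: string) {
      pending = value;
      timers.clear(handle);
      handle = timers.set(flush, idleMs);
    },
    // Call when the field loses focus.
    blur: flush,
  };
}
```

In a browser this would be wired to the field's `input` and `blur` events, so typing "hello" produces one change instead of five.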

These steps will significantly reduce the size of the log. They did in my implementations.

But this is not the end of real time for such applications. They could still pass changes directly over P2P, as long as the log remains consistent, so that the resulting document will always be eventually consistent.