by jitl 697 days ago
With any collaborative application, you need to start with a conceptual framework for the edits a user may perform, and how to best preserve the intention of the user (1) and the coherency of the resulting document (2) when any such edits may occur asynchronously. Even if a document is data-intensive in its concrete representation, the way you encode the user's discrete edits & operations can still be tiny.

Let's say we're building an image editor like Photoshop. An uncompressed 102 megapixel image with 16-bit color depth per channel (a photo from a Fujifilm GFX100 camera) would be about 610 MB as a TIFF. Representing each pixel of the image as a separate last-write-wins register would impose a high overhead, and more importantly, such a representation does nothing to preserve a user's intention. The edits the user will perform are things like "increase image contrast by 15%" or "paint spline [(0,0), (1500, 1500)] with brush Q and color #000". If we sync each pixel by Lamport timestamp, we could end up with user 1's contrast applying to all pixels except for those painted by user 2, which would give a weird-looking image with the painted-over pixels looking out of place.
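
To make that failure mode concrete, here's a minimal sketch of per-pixel last-write-wins merging on a 4-pixel grayscale "image". The `Register` type, the `merge` function, and the replica ids are all made up for illustration, and a flat brightness bump stands in for a real contrast adjustment. Which user wins a tie depends on the tie-breaker you pick; here it is the larger replica id.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Register:
    value: int       # pixel brightness, 0-255
    timestamp: int   # Lamport timestamp of the last write
    replica: str     # tie-breaker when timestamps are equal

def merge(a: Register, b: Register) -> Register:
    """Last-write-wins: keep the write with the larger (timestamp, replica)."""
    return a if (a.timestamp, a.replica) >= (b.timestamp, b.replica) else b

# A 4-pixel grayscale "image", every pixel starting at brightness 100.
base = [Register(100, 0, "init") for _ in range(4)]

# User 1 adjusts every pixel (+50 brightness as a stand-in for contrast).
user1 = [Register(r.value + 50, 1, "u1") for r in base]

# User 2 concurrently paints pixels 1 and 2 black at the same timestamp.
user2 = list(base)
user2[1] = Register(0, 1, "u2")
user2[2] = Register(0, 1, "u2")

# Merge pixel by pixel, as a per-pixel register CRDT would.
merged = [merge(a, b) for a, b in zip(user1, user2)]
values = [r.value for r in merged]
print(values)  # [150, 0, 0, 150]
```

Each pixel converges, but the result is a patchwork: user 1's adjustment landed everywhere except where user 2 painted, and neither user's edit survives as a whole.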

Instead you'd probably want to represent user intention as a list of edit operations, which are much smaller than a whole 102-megapixel grid. A CRDT data structure is one possible technical mechanism to synchronize that user intent, but you pick the structure to match the user intent semantics, not to match the concrete data layout of your output.
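
One way to sketch the operation-list idea is a grow-only set of operations that every replica replays in the same deterministic order. This is only one possible design, not necessarily the right one for a real editor; the `Op` type, its fields, and the JSON encoding of the arguments are assumptions made for the example, and the raw size assumes three color channels.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    timestamp: int   # Lamport timestamp
    replica: str     # replica id, breaks timestamp ties
    kind: str        # e.g. "contrast" or "paint"
    args: str        # JSON-encoded parameters (a str keeps Op hashable)

def merge(a: frozenset, b: frozenset) -> frozenset:
    """Grow-only set merge: union is commutative, associative, idempotent."""
    return a | b

def replay_order(ops: frozenset) -> list:
    """Every replica replays operations in the same total order."""
    return sorted(ops, key=lambda op: (op.timestamp, op.replica))

contrast = Op(1, "u1", "contrast", json.dumps({"percent": 15}))
paint = Op(1, "u2", "paint", json.dumps(
    {"spline": [[0, 0], [1500, 1500]], "brush": "Q", "color": "#000"}))

# The two replicas receive the operations in opposite orders.
replica1 = merge(frozenset({contrast}), frozenset({paint}))
replica2 = merge(frozenset({paint}), frozenset({contrast}))

order1 = [op.kind for op in replay_order(replica1)]
order2 = [op.kind for op in replay_order(replica2)]
print(order1, order2)  # ['contrast', 'paint'] ['contrast', 'paint']

# Rough size comparison against the 102-megapixel photo mentioned earlier.
raw_bytes = 102_000_000 * 3 * 2   # pixels x channels (assumed RGB) x 2 bytes
op_bytes = sum(len(op.kind) + len(op.args) for op in replica1)
print(raw_bytes, op_bytes)
```

Both replicas end up replaying "contrast" then "paint", so each edit is applied to the whole image as its author intended, and the synced state is a few dozen bytes rather than hundreds of megabytes.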

You may still end up having edit operations that contain massive amounts of data, like "add new layer named `bg` below layer `fg` with pixels `data:(10mb of pixels)` at (1500, 1500)". But the overhead for synchronizing that kind of edit command is very low: the synchronization metadata is O(1) per command, not O(pixels in the edit command).
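
A rough illustration of that last point, with an invented `AddLayerOp` type and an invented byte accounting in `sync_overhead` (a real wire format would differ): the sync layer treats the pixels as an opaque blob, so the metadata it has to track is the same for a 10-byte layer and a 10 MB one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddLayerOp:
    timestamp: int    # Lamport timestamp
    replica: str      # replica id
    name: str         # new layer's name
    below: str        # layer it is inserted under
    position: tuple   # (x, y) placement
    pixels: bytes     # opaque payload, never inspected by the sync layer

def sync_overhead(op: AddLayerOp) -> int:
    """Metadata bytes under a hypothetical encoding, excluding the payload."""
    return (8 + len(op.replica) + len(op.name) + len(op.below)
            + 8 * len(op.position))

small = AddLayerOp(1, "u1", "bg", "fg", (1500, 1500), bytes(10))
large = AddLayerOp(2, "u1", "bg", "fg", (1500, 1500), bytes(10_000_000))

print(sync_overhead(small), sync_overhead(large))  # 30 30
```

You still have to ship the 10 MB payload once, but the bookkeeping does not grow with it, unlike the per-pixel register approach where every pixel carries its own timestamp.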