by billconan 697 days ago
I wonder if there has been any practical CRDT for data dense applications, such as images (pixels) and 3D models?
6 comments

With any collaborative application, you need to start with a conceptual framework for the edits a user may perform, and how to best preserve the intention of the user (1) and the coherency of the resulting document (2) when any such edits may occur asynchronously. Even if a document is data-intensive in its concrete representation, the way you encode the user's discrete edits & operations can still be tiny.

Let's say we're building an image editor like Photoshop. An uncompressed 102 megapixel image with 16-bit color depth per channel (a photo from a Fujifilm GFX100 camera) would be about 610 MB as a TIFF. Representing each pixel of the image as a separate last-write-wins register would impose a high overhead, and such a representation doesn't actually preserve a user's intention anyway. The edits the user will perform are things like "increase image contrast by 15%" or "paint spline [(0,0), (1500, 1500)] with brush Q and color #000". If we sync each pixel by Lamport timestamp, we could end up with user 1's contrast applying to all pixels except for those painted by user 2, which would give a weird-looking image with the painted-over pixels looking out of place.

Instead you'd probably want to represent user intention as a list of edit operations, which are much smaller than a whole 102-megapixel pixel grid. A CRDT data structure is one possible technical mechanism to synchronize that user intent, but you pick the structure to match the user-intent semantics, not to match the concrete data layout of your output.
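To make the "list of edit operations" idea concrete, here is a minimal sketch, not how any real editor does it. It assumes the log is a grow-only set of ops merged by set union and replayed in (Lamport timestamp, replica id) order. `EditOp`, `merge_logs`, `render`, and the toy "brighten" and "paint" ops are all made-up names for illustration, and the "image" is just four grayscale values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditOp:
    lamport: int   # logical timestamp assigned by the issuing replica
    replica: str   # replica id, used only to break timestamp ties
    kind: str      # "brighten" or "paint" (toy stand-ins for real edits)
    args: tuple    # small, hashable operation parameters


def merge_logs(a: frozenset, b: frozenset) -> frozenset:
    """Merging two replicas' logs is set union: commutative and idempotent."""
    return a | b


def render(base, log):
    """Replay every op in a deterministic order to produce the pixels."""
    pixels = list(base)
    for op in sorted(log, key=lambda o: (o.lamport, o.replica)):
        if op.kind == "brighten":
            (amount,) = op.args
            pixels = [min(255, p + amount) for p in pixels]
        elif op.kind == "paint":
            indices, value = op.args
            for i in indices:
                pixels[i] = value
    return pixels


base = [100, 100, 100, 100]
log_u1 = frozenset({EditOp(2, "u1", "brighten", (50,))})
log_u2 = frozenset({EditOp(1, "u2", "paint", ((1, 2), 40))})

# Both replicas end up with the same log no matter who merges into whom.
merged_a = merge_logs(log_u1, log_u2)
merged_b = merge_logs(log_u2, log_u1)
result = render(base, merged_a)
print(result)  # [150, 90, 90, 150]
```

The brighten op lands on the painted pixels too, because what gets synced is the operation, not its per-pixel output. The replay order here is just one deterministic choice; a real editor would pick ordering rules that match its own semantics.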

You may still end up with edit operations that contain massive amounts of data, like "add new layer named `bg` below layer `fg` with pixels `data:(10mb of pixels)` at (1500, 1500)". But the overhead for synchronizing that kind of edit command is very low: its size is O(1), not O(pixels in the edit command).
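A back-of-the-envelope version of the overhead argument, as a sketch. It assumes each synchronized unit carries a 16-byte tag (an 8-byte Lamport timestamp plus an 8-byte replica id); that tag size is my assumption, not a figure from any particular CRDT library. The pixel count is rounded to 102 million.

```python
# Rough arithmetic only; TAG_BYTES is an assumed size for the sync metadata.
PIXELS = 102_000_000          # ~102 megapixels, rounded
CHANNELS = 3                  # RGB
BYTES_PER_CHANNEL = 2         # 16-bit color depth
TAG_BYTES = 8 + 8             # assumed: 8-byte timestamp + 8-byte replica id

raw_image_bytes = PIXELS * CHANNELS * BYTES_PER_CHANNEL
per_pixel_tag_bytes = PIXELS * TAG_BYTES   # one LWW register per pixel
per_op_tag_bytes = TAG_BYTES               # one tag for a whole edit op

print(raw_image_bytes / 1e6)      # 612.0  (MB of raw pixel data)
print(per_pixel_tag_bytes / 1e6)  # 1632.0 (MB of tags alone)
print(per_op_tag_bytes)           # 16     (bytes, whatever the payload size)
```

Under those assumptions the per-pixel tags alone would outweigh the image itself, while an edit op pays the same 16 bytes whether it carries a few coordinates or 10 MB of layer pixels.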

It is not exactly the same, but I believe that Figma supports concurrent edits and uses an approach similar to CRDTs (https://www.figma.com/blog/how-figmas-multiplayer-technology...).
I'm not sure that CRDTs would be necessary for image editing, since all conflicting edits could easily be resolved with a last-writer-wins approach. 3D models are a different beast, and I haven't seen any collaborative 3D modeling tool on the market (though I haven't actively searched).
There is a 3D modelling tool called Spline that supports multiplayer editing. I suppose it's using OT.
I sketched out what a performant pixel-based CRDT might look like in my big CRDT article: http://archagon.net/blog/2018/03/24/data-laced-with-history/...

Never tried building it, though. And I’m not sure it would actually be practical, but it would at least preserve the full history of the document.

A cool example I've seen is Modyfi, which is a non-destructive editor for raster graphics. They use Yjs to represent the data, but instead of storing raw pixels, they are storing the entire history of transformations.

https://digest.browsertech.com/archive/browsertech-digest-ho...

What would a concurrent edit look like? If there are two concurrent edits to the same pixel or vertex, what should the result look like? The easiest answer is that you decide arbitrarily (say, by timestamp). That is "last-write-wins", which is itself a CRDT.
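For reference, a last-write-wins register is small enough to sketch in full. This is a minimal illustration, assuming integer logical timestamps with the replica id as tie-breaker; the `LWWRegister` class and the example values are made up. The asserts check the properties that make the merge a valid CRDT merge: commutative, associative, and idempotent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: object    # e.g. a pixel color or a vertex position
    timestamp: int   # logical timestamp of the write
    replica: str     # replica id, breaks ties between equal timestamps

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        """Keep whichever write has the larger (timestamp, replica) pair."""
        mine = (self.timestamp, self.replica)
        theirs = (other.timestamp, other.replica)
        return self if mine >= theirs else other


a = LWWRegister("#ff0000", 5, "alice")
b = LWWRegister("#0000ff", 5, "bob")    # concurrent with a: same timestamp
c = LWWRegister("#00ff00", 3, "carol")  # older write

assert a.merge(b) == b.merge(a)                    # commutative
assert a.merge(b).merge(c) == a.merge(b.merge(c))  # associative
assert a.merge(a) == a                             # idempotent

winner = a.merge(b).merge(c)
print(winner.value)  # #0000ff
```

The tie between `alice` and `bob` is settled arbitrarily by comparing replica ids, which is exactly the "decide arbitrarily" part: every replica converges on the same value, but nothing says it is the value either user wanted.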