Hacker News
by im_down_w_otp 2166 days ago
That's not quite right. There are a lot of sound strategies for culling/merging/resolving CRDT state, in part or in total, depending on the use case and/or the topology of the system that interacts with the CRDT.

It's possible to construct a pathological case where it's impossible to soundly GC the CRDT state, and where you have to keep around an arbitrarily long list of per-agent updates or list of agents forever, but that shouldn't be the normal case.


Yeah, CRDTs don't require keeping a history of every change forever, or at all. In other words, all the changes coming from a bad actor can be merged locally into a single change, a small set of changes, or whatever is appropriate, and only that is propagated to other nodes. Plus nodes can easily know how far all of them have progressed and drop history before the most far behind point. History should only grow a bit more than usual during outages.
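
To make the "drop history before the most far behind point" idea concrete, here is a minimal sketch. It assumes every node reports the highest sequence number it has applied; the `OpLog` class and its method names are hypothetical and not taken from any CRDT library.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OpLog:
    """Hypothetical operation log keyed by sequence number."""

    ops: dict[int, str] = field(default_factory=dict)    # seq -> operation payload
    acked: dict[str, int] = field(default_factory=dict)  # node id -> highest seq applied
    next_seq: int = 1

    def append(self, payload: str) -> int:
        seq = self.next_seq
        self.ops[seq] = payload
        self.next_seq += 1
        return seq

    def ack(self, node_id: str, seq: int) -> None:
        # Acknowledgements only ever move forward.
        self.acked[node_id] = max(self.acked.get(node_id, 0), seq)

    def stable_point(self) -> int:
        # Highest seq that every known node has applied; 0 if nobody has reported.
        return min(self.acked.values(), default=0)

    def prune(self) -> int:
        """Drop every op at or below the stable point; return how many were dropped."""
        cutoff = self.stable_point()
        stale = [seq for seq in self.ops if seq <= cutoff]
        for seq in stale:
            del self.ops[seq]
        return len(stale)


log = OpLog()
for text in ["insert a", "insert b", "delete a", "insert c"]:
    log.append(text)

log.ack("server-1", 4)
log.ack("server-2", 3)
log.ack("laptop", 2)  # the node that is furthest behind

dropped = log.prune()
print(dropped, sorted(log.ops))  # prints: 2 [3, 4]
```

Note that a single node that never acknowledges holds the stable point in place, so in this sketch the log cannot shrink past whatever that node last reported.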
> Plus nodes can easily know how far all of them have progressed and drop history before the most far behind point.

That may be easy when coordinating servers that are almost always online, but it's definitely not easy for desktop/mobile clients that go offline for long periods (and sometimes don't come back).

A middle ground could be a combination of "historic" nodes that keep all known history and "client" nodes that only care about history from their own moment of sync onward, then optimistically drop history via some heuristic.
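
A rough sketch of that split, under stated assumptions: operations carry a global sequence number, and the client's "heuristic" is simply to keep its newest N operations. `HistoricNode`, `ClientNode`, and `max_ops` are made-up names for illustration only.

```python
from __future__ import annotations


class HistoricNode:
    """Keeps every operation it has ever seen."""

    def __init__(self) -> None:
        self.ops: list[tuple[int, str]] = []  # (seq, payload), in seq order

    def record(self, seq: int, payload: str) -> None:
        self.ops.append((seq, payload))

    def ops_since(self, seq: int) -> list[tuple[int, str]]:
        return [op for op in self.ops if op[0] > seq]


class ClientNode:
    """Keeps history only from its own sync point, trimmed by a size heuristic."""

    def __init__(self, sync_point: int, max_ops: int) -> None:
        self.oldest_kept = sync_point  # everything at or below this seq is not held here
        self.max_ops = max_ops         # assumed heuristic: keep only the newest N ops
        self.ops: list[tuple[int, str]] = []

    def record(self, seq: int, payload: str) -> None:
        self.ops.append((seq, payload))
        while len(self.ops) > self.max_ops:
            dropped_seq, _ = self.ops.pop(0)
            self.oldest_kept = dropped_seq

    def ops_since(self, seq: int) -> list[tuple[int, str]] | None:
        # None means "I no longer have that far back; ask a historic node".
        if seq < self.oldest_kept:
            return None
        return [op for op in self.ops if op[0] > seq]


SYNC_POINT = 2  # the client joined after seq 2
historic = HistoricNode()
client = ClientNode(sync_point=SYNC_POINT, max_ops=2)

for seq, payload in [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]:
    historic.record(seq, payload)
    if seq > SYNC_POINT:
        client.record(seq, payload)

print(client.ops_since(3))    # prints: [(4, 'd'), (5, 'e')]
print(client.ops_since(1))    # prints: None
print(historic.ops_since(1))  # prints: [(2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

The point of returning `None` instead of a partial list is that a peer can tell the difference between "nothing new" and "history was dropped", and fall back to a historic node in the second case.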
Do you have any pointers to JavaScript libraries that work this way? All the libs I've looked at (not recently) require the server to keep a running log of updates for the life of the document.

For example, the automerge library linked elsewhere on this thread requires it.

My understanding (flawed) is that you need to keep all the changes on the server because you never know how long it's been since a client has pulled/pushed changes to a document.

I guess arbitrary limits based on number of updates or time can be imposed, but I haven't seen libraries that do that.
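
As a sketch of what such limits might look like (not how any particular library does it), the snippet below trims an update log by count and by age, then checks whether a returning client can still catch up from the retained updates. `Update`, `apply_retention`, and `can_catch_up` are hypothetical names, and the updates are assumed to be sorted by sequence number.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    seq: int
    timestamp: float  # seconds since some epoch
    payload: str


def apply_retention(
    updates: list[Update], now: float, max_count: int, max_age_seconds: float
) -> list[Update]:
    """Keep updates that are recent enough AND among the newest max_count."""
    fresh = [u for u in updates if now - u.timestamp <= max_age_seconds]
    return fresh[-max_count:] if max_count > 0 else []


def can_catch_up(retained: list[Update], client_last_seq: int) -> bool:
    """True if the retained updates cover everything after client_last_seq.

    Conservative: with nothing retained, the client is told to do a full resync.
    """
    return bool(retained) and retained[0].seq <= client_last_seq + 1


updates = [Update(seq, (seq - 1) * 100.0, f"change {seq}") for seq in range(1, 6)]
retained = apply_retention(updates, now=450.0, max_count=3, max_age_seconds=300.0)

print([u.seq for u in retained])  # prints: [3, 4, 5]
print(can_catch_up(retained, 2))  # prints: True
print(can_catch_up(retained, 1))  # prints: False
```

A client that fails the `can_catch_up` check would have to fetch the full current document state rather than replay updates, which is the trade-off any count or time limit implies.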

Thanks.

Did you look at hypermerge, also in the automerge org? It is based on the DAT protocol and hypercore, and I think it is the p2p version of automerge (I just found the project).

https://github.com/automerge/hypermerge

No I didn't. I'll check it out. Thanks!