by kavinsood 99 days ago
Hey Daniel, it is so awesome to see you here.

1. Spot on. This is the ceiling of text-based CRDTs. Since we last spoke, I fixed the structural side of renames by moving path authority onto stable IDs, but links inside the note body are still plain text, so concurrent rename-driven rewrites can duplicate the rewritten link text.
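To make the duplication concrete, here is a minimal sketch using a toy sequence CRDT (not Yjs; `ToyText`, `rename_link`, and `merged_text` are hypothetical names made up for illustration). Two replicas both react to the same rename by deleting `old` and inserting `new`. The deletes are idempotent, but each insert carries its own IDs, so both survive the merge.

```python
from dataclasses import dataclass, field


@dataclass
class ToyText:
    """Toy sequence CRDT: every character has a unique id; deletes are tombstones."""
    chars: list = field(default_factory=list)  # [(char_id, char)] in document order
    deleted: set = field(default_factory=set)  # ids of tombstoned characters

    def _index(self, char_id):
        return next(i for i, (cid, _) in enumerate(self.chars) if cid == char_id)

    def insert_after(self, anchor_id, new_chars):
        # Place the new characters directly after the anchor character.
        # (A real CRDT would break ties between concurrent inserts by replica id.)
        idx = self._index(anchor_id) + 1
        self.chars[idx:idx] = new_chars

    def delete(self, char_ids):
        # Deleting the same id twice is a no-op, so concurrent deletes merge cleanly.
        self.deleted |= set(char_ids)

    def text(self):
        return "".join(ch for cid, ch in self.chars if cid not in self.deleted)


def rename_link(doc, replica, old_ids, anchor_id, new_name):
    """One replica's machine edit: drop the old target, type the new one."""
    doc.delete(old_ids)
    doc.insert_after(anchor_id, [((replica, i), ch) for i, ch in enumerate(new_name)])


def merged_text():
    base = [(("base", i), ch) for i, ch in enumerate("[[old]]")]
    doc = ToyText(chars=list(base))
    old_ids = [cid for cid, _ in base[2:5]]  # the characters of "old"
    anchor = base[1][0]                      # the second "["
    # Both replicas issue the same rewrite concurrently; the merge applies both.
    rename_link(doc, "A", old_ids, anchor, "new")
    rename_link(doc, "B", old_ids, anchor, "new")
    return doc.text()


print(merged_text())
```

The CRDT has no way to know the two inserts express the same intent, which is the semantic loss being discussed.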

I realised that this problem is uniquely painful in Obsidian because of the "Automatically update internal links" setting. Since people use Obsidian as a PKM tool, the app itself is making machine edits. That turns this CRDT edge case into a guaranteed anomaly, which is bad.

AFAIK, Notion can make this work because of its AST-based DB. I'm sure you've heard of Ink & Switch's Peritext, but that's quite experimental (sidenote: Keyhive, also by them, is a possible solution to marrying E2EE and CRDTs).

I'm basically accepting this tradeoff: semantic intent loss in exchange for simplicity.

2. I love the 'intent fidelity spectrum' framing. What I have today is a good solution to the 'mechanical filesystem-bridge' problem (trailing-edge coalescing, self-echo suppression, and active-editor recovery), but not yet a full answer to the semantic merge problem.
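For readers unfamiliar with the terms, here is a rough sketch of trailing-edge coalescing and self-echo suppression. It is an assumption-laden illustration, not the plugin's code: `FileBridge`, its methods, and the 300 ms quiet period are all hypothetical, and timestamps are passed in explicitly so the behaviour is deterministic.

```python
import hashlib


class FileBridge:
    """Hypothetical bridge between filesystem watcher events and a sync layer."""

    def __init__(self, quiet_ms=300):
        self.quiet_ms = quiet_ms
        self.pending = {}     # path -> (deadline_ms, latest content)
        self.own_writes = {}  # path -> hash of the content the sync layer last wrote

    @staticmethod
    def _digest(content):
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def write_from_sync(self, path, content):
        # Remember what we wrote so the watcher event it triggers can be ignored.
        self.own_writes[path] = self._digest(content)

    def on_fs_event(self, path, content, now_ms):
        if self.own_writes.get(path) == self._digest(content):
            return  # self-echo: the disk content is exactly what we just wrote
        # Trailing edge: each new event replaces the content and pushes the deadline back.
        self.pending[path] = (now_ms + self.quiet_ms, content)

    def flush_due(self, now_ms):
        # Emit only paths that have been quiet for the full period.
        due = {p: c for p, (deadline, c) in self.pending.items() if deadline <= now_ms}
        for p in due:
            del self.pending[p]
        return due


bridge = FileBridge(quiet_ms=300)
bridge.write_from_sync("a.md", "remote")
bridge.on_fs_event("a.md", "remote", now_ms=0)  # suppressed as a self-echo
for t, text in [(1000, "h"), (1100, "he"), (1200, "hey")]:
    bridge.on_fs_event("a.md", text, now_ms=t)  # a burst of saves
print(bridge.flush_due(now_ms=1500))            # one coalesced update
```

Matching on content hashes rather than on timing is one possible design; it tolerates watchers that fire more than once per write.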

That said, if I had to implement merging against an LCA (lowest common ancestor), I'd have to store historical snapshots locally per file. Currently, I'm not sharding Yjs per file, so that'd be quite inefficient. Relay, though, could easily instantiate a ghost (I see the wisdom in your architecture here!).

But also, LCA would halt on hard conflicts, taking away from the core promise of a CRDT. Which UX is better (LCA or not) is debatable, I think, but you cover the bases with DMP (diff-match-patch) and conflict markers.
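As a deliberately coarse sketch of that tradeoff, here is a three-way merge at whole-file granularity. This is not Relay's implementation and does not use diff-match-patch; `three_way_merge` and the marker labels are hypothetical. It shows why the base snapshot has to be stored, and where a hard conflict stops the automatic path.

```python
def three_way_merge(base, ours, theirs):
    """Return (text, had_conflict), using the common ancestor `base` to decide."""
    if ours == theirs:
        return ours, False    # both sides agree (or nothing changed)
    if ours == base:
        return theirs, False  # only the other side changed
    if theirs == base:
        return ours, False    # only our side changed
    # Both sides changed differently: hand the decision to the user.
    marked = (
        "<<<<<<< ours\n" + ours + "\n=======\n" + theirs + "\n>>>>>>> theirs\n"
    )
    return marked, True


print(three_way_merge("draft", "draft", "draft v2"))
print(three_way_merge("draft", "draft A", "draft B")[1])
```

A real implementation would run the same decision per hunk rather than per file, so that non-overlapping edits merge cleanly.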

3. Ah, a competing sync layer is still the classic "please don't do that" configuration.

I retain tombstones for anti-resurrection correctness, so they can blow up (though I'm exploring an epoch-fenced vacuum for tombstone GC). I do have automatic daily snapshots with a recovery UI built into the plugin; that would be my best answer.
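One way an epoch-fenced vacuum could work, sketched under my own assumptions (the actual design is not described here; `TombstoneSet` and its methods are hypothetical): a tombstone is only dropped once every known replica has acknowledged an epoch at or past the deletion, so no stale replica can re-announce the file afterwards.

```python
class TombstoneSet:
    """Hypothetical file-id set with tombstones and an epoch fence for GC."""

    def __init__(self, replicas):
        self.live = set()
        self.tombstones = {}                   # file_id -> epoch of deletion
        self.acked = {r: 0 for r in replicas}  # replica -> highest epoch it has seen

    def add(self, file_id):
        # Anti-resurrection: a stale replica re-announcing a deleted id is ignored.
        if file_id in self.tombstones:
            return False
        self.live.add(file_id)
        return True

    def delete(self, file_id, epoch):
        self.live.discard(file_id)
        self.tombstones[file_id] = epoch

    def ack(self, replica, epoch):
        self.acked[replica] = max(self.acked[replica], epoch)

    def vacuum(self):
        # Fence: the lowest epoch acknowledged by every known replica.
        fence = min(self.acked.values(), default=0)
        dropped = [f for f, e in self.tombstones.items() if e <= fence]
        for f in dropped:
            del self.tombstones[f]
        return dropped


s = TombstoneSet(["laptop", "phone"])
s.add("note-1")
s.delete("note-1", epoch=3)
s.ack("laptop", 3)
print(s.vacuum())  # the phone has not caught up, so nothing is dropped yet
s.ack("phone", 3)
print(s.vacuum())  # now the tombstone is safe to drop
```

The weak point of any such fence is a replica that never comes back online, which is presumably where snapshots remain the fallback.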

..

Mentally, a blocker for me in refactoring to sharded Yjs is large offline cross-file structural changes like folder renames. Do you try to preserve a vault-level consistency boundary, or do you let the file docs converge independently and hide the intermediate tearing?

I can tell that you've spent a lot of time in the deep end. I’ll bump our email thread too, would love to compare scars.

1 comment

We let docs converge independently. This is a problem for Bases in the current sync engine, but it's something we're resolving soon with "continuous-background-sync". I also think independent convergence is more scalable and matches the file model better.

We landed on folder-level sync rather than vault-level sync, so we have a map CRDT that corresponds with each shared folder. In our model these CRDTs are the ones that can explode, whereas the doc-level ones can kind of be fixed up by dragging the file out of the folder and back in again, which grabs a new "inode" for it.
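A rough single-replica sketch of that shape, with a plain dict standing in for the map CRDT (`SharedFolder` and the `doc-N` ids are hypothetical; a real system would presumably use globally unique ids and merge concurrent updates): the folder map owns the path-to-document binding, a rename keeps the id, and a remove followed by a re-add allocates a fresh one.

```python
import itertools


class SharedFolder:
    """Hypothetical folder-level map: relative path -> document id ("inode")."""

    def __init__(self):
        self._next = itertools.count(1)
        self.entries = {}

    def add(self, path):
        # Every add allocates a fresh id, so the file starts from a new doc.
        doc_id = f"doc-{next(self._next)}"
        self.entries[path] = doc_id
        return doc_id

    def remove(self, path):
        return self.entries.pop(path)

    def rename(self, old_path, new_path):
        # A rename only rebinds the path; the underlying doc id is preserved.
        self.entries[new_path] = self.entries.pop(old_path)


folder = SharedFolder()
first = folder.add("note.md")
folder.rename("note.md", "renamed.md")
print(folder.entries["renamed.md"] == first)  # same doc after a rename

folder.remove("renamed.md")                   # drag the file out...
second = folder.add("renamed.md")             # ...and back in
print(second != first)                        # fresh "inode", broken doc left behind
```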

If I were to start again I think I'd try to build a file-based persistence layer based on prolly-trees to better adhere to the file-over-app philosophy.
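For context on why prolly trees fit here, this is a one-level sketch of their core trick, content-defined chunk boundaries. A real prolly tree applies the same rule recursively to build a tree; `chunk_entries`, `chunk_ids`, and the modulus of 4 are hypothetical choices for illustration. Boundaries depend only on the entries themselves, so the same set of files always yields the same chunks, and a single change touches at most one existing chunk.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def chunk_entries(entries, modulus=4):
    """Split sorted (path, content_hash) entries into content-defined chunks."""
    chunks, current = [], []
    for path, content_hash in sorted(entries.items()):
        current.append((path, content_hash))
        # The boundary test looks only at the key, never at insertion history.
        if int(_h(path), 16) % modulus == 0:
            chunks.append(tuple(current))
            current = []
    if current:
        chunks.append(tuple(current))
    return chunks


def chunk_ids(entries, modulus=4):
    # Each chunk is addressed by the hash of its contents.
    return [_h(repr(chunk)) for chunk in chunk_entries(entries, modulus)]


files = {f"notes/{i}.md": f"v{i}" for i in range(40)}
reordered = dict(reversed(list(files.items())))
print(chunk_ids(files) == chunk_ids(reordered))  # history-independent

edited = dict(files)
edited["notes/new.md"] = "v-new"
changed = set(chunk_ids(files)) - set(chunk_ids(edited))
print(len(changed) <= 1)                         # only a local change
```

Unchanged chunks keep their ids, so two peers can compare ids and exchange only the chunks that differ.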