Show HN: YAOS – A 1-click deploy, real-time sync engine for Obsidian


5 points by kavinsood 100 days ago

Hey HN,

I'm a heavy Obsidian user.

I recently got tired of the two usual sync tradeoffs:

1. File-based sync (iCloud/Dropbox/Syncthing) that leaves you waiting for changes to propagate, or hands you a "conflicted copy."
2. Self-hosted setups (like CouchDB) that have you touching VMs and dockerized databases just to sync markdown.

So I built YAOS: a local-first, real-time sync engine for Obsidian.

Self-hosting OSS should have better UX.

You can deploy the backend to your own Cloudflare account in one click. It fits comfortably in Cloudflare's free tier (costing $0/month for normal personal use), and requires absolutely no terminal interaction, no SSH, and no env files.

You can try it out right now: https://github.com/kavinsood/yaos

How it works under the hood:

- Text sync uses Yjs CRDTs. It syncs real-time keystrokes and cursors rather than treating the vault as a pile of files to push around later.
- Each vault maps to a Cloudflare Durable Object, giving you a low-latency, single-threaded coordinator at the edge.
- The backend uses a chunked Checkpoint + Delta-Journal MVCC storage engine on top of the DO's SQLite storage.
- Attachments sync separately via R2 (which is optional; text sync works fine without it).
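For readers unfamiliar with the checkpoint-plus-journal pattern named in the list above, here is a minimal in-memory sketch of the general idea. It is an assumption-laden illustration, not the YAOS storage engine: the class name, the `maxJournal` threshold, and the use of string concatenation as a stand-in for merging CRDT updates are all hypothetical, and chunking, MVCC, and SQLite are left out entirely.

```typescript
// Hypothetical sketch of a checkpoint + delta journal; not YAOS's actual code.
type Delta = { seq: number; bytes: string };

class CheckpointJournal {
  private checkpoint = { upToSeq: 0, state: "" };
  private journal: Delta[] = [];
  private nextSeq = 1;
  private maxJournal: number;

  constructor(maxJournal: number) {
    this.maxJournal = maxJournal;
  }

  // Appending a delta is cheap: one small journal row per update.
  append(bytes: string): void {
    this.journal.push({ seq: this.nextSeq++, bytes });
    if (this.journal.length > this.maxJournal) this.compact();
  }

  // Loading = last checkpoint + replay of the deltas written after it.
  // String concatenation stands in for merging CRDT updates.
  load(): string {
    return this.journal.reduce((s, d) => s + d.bytes, this.checkpoint.state);
  }

  // Fold the journal into a fresh checkpoint and truncate it.
  compact(): void {
    const last = this.journal[this.journal.length - 1];
    if (!last) return;
    this.checkpoint = { upToSeq: last.seq, state: this.load() };
    this.journal = [];
  }

  journalLength(): number {
    return this.journal.length;
  }
}

const store = new CheckpointJournal(2);
store.append("a");
store.append("b");
const beforeCompaction = store.journalLength(); // journal still holds both deltas
store.append("c"); // third delta exceeds the threshold and triggers compaction
const afterCompaction = store.journalLength();
const loaded = store.load();
```

The point of the pattern is that writes stay small and append-only, while reads never have to replay more than `maxJournal` deltas on top of the latest checkpoint.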

The hardest part of this project was bridging Obsidian's synchronous UI and noisy OS file-watchers with the in-memory CRDT graph. I had to build a trailing-edge snapshot drain that coalesces rapid-fire IO bursts (like running a find-and-replace-all) into atomic CRDT transactions to prevent infinite write-loops.
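A rough sketch of what trailing-edge coalescing can look like, under stated assumptions: the class and method names are hypothetical, time is passed in explicitly so the behaviour is deterministic (a real plugin would likely drive this from a timer), and the returned batch is what a caller would apply inside a single CRDT transaction. Self-echo suppression is not modelled here.

```typescript
// Hypothetical sketch; names are illustrative, not taken from YAOS.
class TrailingEdgeDrain {
  private dirty = new Set<string>();
  private lastEventAt = 0;
  private quietMs: number;

  constructor(quietMs: number) {
    this.quietMs = quietMs;
  }

  // Record a file-watcher event. Repeated events for one path collapse
  // into a single entry, and every event pushes the deadline back.
  touch(path: string, now: number): void {
    this.dirty.add(path);
    this.lastEventAt = now;
  }

  // Returns the whole batch once the burst has been quiet for quietMs,
  // otherwise null. The caller would snapshot these paths from disk and
  // apply them in ONE transaction.
  poll(now: number): string[] | null {
    if (this.dirty.size === 0) return null;
    if (now - this.lastEventAt < this.quietMs) return null;
    const batch = Array.from(this.dirty).sort();
    this.dirty.clear();
    return batch;
  }
}

const drain = new TrailingEdgeDrain(200);
drain.touch("a.md", 0);
drain.touch("b.md", 50);
drain.touch("a.md", 120); // same file again, still one entry
const duringBurst = drain.poll(250); // only 130 ms of quiet: nothing drained
const afterBurst = drain.poll(320); // 200 ms of quiet: one batch for both files
```

Draining on the trailing edge means a find-and-replace-all that touches hundreds of files produces one batch after the burst ends, rather than one transaction per watcher event.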

The current design keeps a monolithic CRDT per vault. This is great for normal personal notes, but has a hard memory ceiling (~50MB of raw text). I chose this tradeoff because I cared more about fast, boringly reliable real-time ergonomics than unbounded enterprise scale.

I also wrote up engineering notes on the tricky parts (like handling offline folder rename collisions without resurrecting dead files) on GitHub.

I've spent the last three weeks doing brutal QA passes to harden mobile reconnects, IndexedDB quota failures, and offline split-brain directory renames.

I'd love feedback on the architecture, the code, or the trade-offs I made. I'll be hanging out in the thread to answer questions!

2 comments

dtkav 100 days ago

Great work Kavin!

This is a super interesting space, and lots of fun and difficult problems to tackle.

A few trailheads of interesting complexity:

1. Concurrent machine edits - in particular handling links to renamed files across devices. This is a case where CRDTs fall over because they converge but are not idempotent. For example renaming a file [[hello 1]] to [[hello 2]] when multiple devices are online can result in [[hello 22]] because deletes merge before inserts.

2. Ingesting disk edits in the age of Claude Code. The intended behavior can change based on what I'm calling the "intent fidelity spectrum". I've been using that spectrum as a guide for when to apply merges in "text space" vs. "crdt space", including sometimes withholding or cancelling ops based on origin (e.g. from Obsidian processFile calls) or offline status. For example, if you made edits while offline and have a least-common-ancestor, you may be able to look for conflicts via diff3 and then conditionally use diff-match-patch if there are no conflicts, or surface the conflict to the user if there's not a good merge strategy given the low level of intent.

3. History and memory management - how do you recover state if a user has a competing sync service which causes an infinite loop in file creation/deletion? This can be difficult with CRDTs because the tombstones just keep syncing back and forth between peers and can be difficult to clear. This is significantly worse if you use Y.PermanentUserData (do not recommend...).
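The [[hello 22]] anomaly from point 1 can be reproduced with a toy sequence CRDT. This is a minimal sketch under loud assumptions: it is not Yjs's actual algorithm, every name in it is hypothetical, and it only handles single-character inserts. Each character has a unique id, deletes are tombstones, and concurrent inserts after the same character are ordered by id.

```typescript
// Toy model only; not how Yjs is implemented.
type Item = { id: string; origin: string | null; ch: string; deleted: boolean };
type Op =
  | { kind: "delete"; target: string }
  | { kind: "insert"; item: Item };

function fromText(text: string): Item[] {
  return text.split("").map((ch, i) => ({ id: `base:${i}`, origin: null, ch, deleted: false }));
}

function apply(doc: Item[], op: Op): void {
  if (op.kind === "delete") {
    const target = op.target;
    const hit = doc.find((it) => it.id === target);
    if (hit) hit.deleted = true; // tombstone: deleting twice is a no-op
    return;
  }
  const item = op.item;
  if (doc.some((it) => it.id === item.id)) return; // already integrated
  // Assumes item.origin names a character that exists in the document.
  let pos = doc.findIndex((it) => it.id === item.origin) + 1;
  // Concurrent inserts after the same character: order by id so that
  // every replica ends up with the same sequence.
  while (pos < doc.length) {
    const next = doc[pos];
    if (!next || next.origin !== item.origin || next.id > item.id) break;
    pos++;
  }
  doc.splice(pos, 0, { ...item });
}

function render(doc: Item[]): string {
  return doc.filter((it) => !it.deleted).map((it) => it.ch).join("");
}

// The link rewrite each device performs: delete "1", insert "2" after it.
function renameOps(device: string): Op[] {
  const oldDigit = "base:8"; // the "1" in "[[hello 1]]"
  return [
    { kind: "delete", target: oldDigit },
    { kind: "insert", item: { id: `${device}:0`, origin: oldDigit, ch: "2", deleted: false } },
  ];
}

function replay(text: string, ops: Op[]): string {
  const doc = fromText(text);
  for (const op of ops) apply(doc, op);
  return render(doc);
}

const LINK = "[[hello 1]]";
const a = renameOps("devA");
const b = renameOps("devB");
const single = replay(LINK, a);
const mergedAB = replay(LINK, a.concat(b));
const mergedBA = replay(LINK, b.concat(a));
```

In this toy model both merge orders converge on the same text, so the CRDT is doing its job. The two deletes collapse into one tombstone, but the two inserts are distinct operations and both survive, which is how the same rename applied on two devices can yield a doubled digit.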


kavinsood 99 days ago

Hey Daniel, it is so awesome to see you here.

1. Spot on. This is the ceiling of text-based CRDTs. Since we last spoke, I fixed the structural side of renames by moving path authority onto stable IDs, but links inside the note body are still plain text, so concurrent rename-driven rewrites can duplicate.

I realised that this problem is uniquely painful in Obsidian because of the "Automatically update internal links" setting. Since people use Obsidian as a PKM, the app itself is making machine-edits. It turns this CRDT edge-case into a guaranteed anomaly, which is bad.

Notion can make this work because of their AST based DB afaik. I'm sure you've heard of Ink & Switch's Peritext but that's quite experimental (sidenote: keyhive by them is a possible solution to marrying E2EE and CRDTs).

I'm basically accepting this tradeoff: semantic intent-loss in exchange for simplicity.

2. I love the 'intent fidelity spectrum' framing. What I have today is a good solution to the 'mechanical filesystem-bridge' problem - trailing-edge coalescing, self-echo suppression, and active-editor recovery, but not yet a full answer to the semantic merge problem.

Though, if I had to implement merge with LCA, I'd have to store historical snapshots locally per file. Currently, I'm not sharding Yjs per file, so that'd be quite inefficient. Relay, on the other hand, could easily instantiate a ghost (I see the wisdom in your architecture here!)

But also, LCA would halt on hard conflicts, taking away from the core promise of a CRDT. I think which UX is better (LCA or not) is debatable, but you cover the bases with DMP and conflict markers.

3. Ah, a competing sync layer is still the classic "please don't do that" configuration.

I retain tombstones for anti-resurrection correctness, so they can blow up (though I'm exploring an epoch-fenced vacuum for tombstone GC). I do have automatic daily snapshots with a recovery UI built into the plugin; that would be my best answer.

Mentally, a blocker for me to refactor to sharded Yjs is large offline cross-file structural changes like folder renames. Do you try to preserve a vault-level consistency boundary, or do you let the file docs converge independently and hide the intermediate tearing?

I can tell that you've spent a lot of time in the deep end. I’ll bump our email thread too, would love to compare scars.


dtkav 99 days ago

We let docs converge independently. This is a problem for bases in the current sync engine, but something we're resolving soon with "continuous-background-sync". I think it is also more scalable and matches the file model better.

We landed on folder-level sync rather than vault-level sync, so we have a map CRDT that corresponds with each shared folder. In our model these CRDTs are the ones that can explode, whereas the doc-level ones can kind of be fixed up by dragging it out of the folder and back in again which grabs a new "inode" for it.

If I were to start again I think I'd try to build a file-based persistence layer based on prolly-trees to better adhere to the file-over-app philosophy.


arabinda 100 days ago

obsidian sync is genuinely one of those things people complain about constantly, interested to see how the conflict resolution works when edits happen on multiple devices simultaneously


kavinsood 100 days ago

Conflicts mathematically don't exist in YAOS because it uses Yjs (a CRDT) under the hood.

This is not traditional file-level conflict resolution, where the tool either silently keeps the most recent write ('last-write-wins') or detects a collision and asks you to pick a winner.

If two devices are editing the same note simultaneously, keystrokes are broadcast over WebSockets to a Cloudflare Durable Object, and Yjs merges the concurrent edits. You will see the remote cursors flying around.

Even if the devices are offline, the edits are saved locally to IndexedDB. When devices reconnect, they compute a state vector difference, exchange the missing binary deltas, and deterministically converge on the exact same text. No conflict files are ever generated.
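To make the reconnect step concrete, here is a simplified model of a state-vector exchange. It is a sketch under assumptions, not Yjs itself: updates are plain objects rather than binary deltas, and all type and function names are hypothetical. In real Yjs the corresponding calls are `Y.encodeStateVector`, `Y.encodeStateAsUpdate(doc, remoteStateVector)`, and `Y.applyUpdate`.

```typescript
// Simplified model of a state-vector sync; not the Yjs wire format.
type Update = { client: string; clock: number; payload: string };
type StateVector = Map<string, number>; // next expected clock per client

function stateVector(log: Update[]): StateVector {
  const sv: StateVector = new Map();
  for (const u of log) {
    const seen = sv.get(u.client) ?? 0;
    sv.set(u.client, Math.max(seen, u.clock + 1));
  }
  return sv;
}

// Everything in localLog that the remote peer has not seen yet.
function missingFor(remote: StateVector, localLog: Update[]): Update[] {
  return localLog.filter((u) => u.clock >= (remote.get(u.client) ?? 0));
}

// Both sides compute what the other is missing, then exchange only that.
function sync(a: Update[], b: Update[]): void {
  const toB = missingFor(stateVector(b), a);
  const toA = missingFor(stateVector(a), b);
  for (const u of toA) a.push(u);
  for (const u of toB) b.push(u);
}

// Order-independent fingerprint, used here only to compare replicas.
function canonical(log: Update[]): string {
  return log
    .map((u) => `${u.client}:${u.clock}:${u.payload}`)
    .sort()
    .join("|");
}

const shared: Update = { client: "laptop", clock: 0, payload: "Hello" };
const phone: Update[] = [
  shared,
  { client: "phone", clock: 0, payload: " from the train" },
  { client: "phone", clock: 1, payload: "!" },
];
const laptop: Update[] = [shared, { client: "laptop", clock: 1, payload: " world" }];

sync(phone, laptop); // phone sends 2 updates, laptop sends 1
```

After the exchange both logs hold the same set of updates, and a second `sync` call would transfer nothing. Turning that shared set into identical text is the CRDT's job and is not modelled here.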

I wasn't looking to make this into a paid service, but rather an OSS plugin. The important thing for me was that self-hosting this shouldn't involve docker, VMs, etc. The DX with Cloudflare is pretty nice.


dtkav 100 days ago

IMO Obsidian Sync is a fantastic solution for e2ee device sync in Obsidian. It is a good/honest business model to fund the development of Obsidian.

What complaints are you hearing?
