by benbjohnson 1426 days ago
Thanks for the vote of confidence! I can understand the "hack" feel -- it's a trade-off. If I wrote it the "proper" way and integrated directly into the SQLite source or used a VFS then it'd be a lot harder to deploy for most folks. By making it a FUSE file system, someone can use it without really knowing much about it from the application's perspective.

As for strategy, layering LiteFS on top of Litestream unfortunately doesn't work, since backups need strict control over which node is the current primary. Instead, I'm adding S3 replication support [1] directly into LiteFS. LiteFS also uses a different transactional file format called LTX, so it wouldn't be compatible with Litestream. The LTX format is optimized for compactions so point-in-time restores can be nearly instant.
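
As a rough illustration of why compaction helps restore time, here is a toy model in Python. It is not the actual LTX format: the `TxFile` type and the `compact` and `restore` helpers are hypothetical names, and the only assumption is that each transaction file records the pages it changed. Merging a run of such files while keeping only the newest version of each page gives one file that restores to the same state with far fewer steps.

```python
from typing import Dict, List

# Toy model (assumption): a transaction file maps page number -> page contents.
TxFile = Dict[int, bytes]


def compact(tx_files: List[TxFile]) -> TxFile:
    """Merge transaction files (oldest first), keeping the newest copy of each page."""
    merged: TxFile = {}
    for tx in tx_files:
        merged.update(tx)  # later files overwrite earlier versions of a page
    return merged


def restore(snapshot: TxFile, tx_files: List[TxFile]) -> TxFile:
    """Rebuild the page set by applying transaction files on top of a snapshot."""
    return compact([snapshot, *tx_files])


snapshot = {1: b"a0", 2: b"b0"}
txs = [{1: b"a1"}, {2: b"b1"}, {1: b"a2", 3: b"c0"}]

# Three small files collapse into one that yields the same restored state.
compacted = compact(txs)
```

Restoring from `snapshot` plus `compacted` touches each page once, however many transactions were folded in, which is the intuition behind "nearly instant" restores.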

The end goal isn't much of a secret. We want to let folks spin up nodes in regions across the world, have them automatically connect to one another, and make the whole thing as easy to run as a single-node app. We still have a ways to go on that vision but that's what we're going for.

[1] https://github.com/superfly/litefs/issues/18

1 comment

> I'm adding S3 replication support directly into LiteFS.

Nice! There's a lot of value one can get out of a blob store, even though it seems at odds with systems that depend on block devices, like most SQL DBMSs.

When a database at BigCloud layered replication (real-time backups) atop S3, they did so by shipping both the WAL and the on-disk files. For write-heavy tables, the WAL was streamed every second, and the on-disk files (snapshots) every 30 minutes (or once some suitable size-based threshold was reached).
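
A minimal sketch of that schedule in Python, to make the two cadences concrete. The one-second and 30-minute intervals come from the description above; the 64 MiB size threshold, the `ReplicaState` fields, and the `next_action` helper are assumptions made up for illustration, not details of the actual system.

```python
from dataclasses import dataclass

WAL_INTERVAL_S = 1                     # WAL streamed every second
SNAPSHOT_INTERVAL_S = 30 * 60          # snapshots every 30 minutes
SNAPSHOT_WAL_BYTES = 64 * 1024 * 1024  # assumed threshold; no figure is given above


@dataclass
class ReplicaState:
    last_wal_ship: float           # time of the last WAL upload, in seconds
    last_snapshot: float           # time of the last snapshot upload, in seconds
    wal_bytes_since_snapshot: int  # WAL volume accumulated since that snapshot


def next_action(state: ReplicaState, now: float) -> str:
    """Decide what to upload next: a snapshot, a WAL segment, or nothing yet."""
    snapshot_due = (
        now - state.last_snapshot >= SNAPSHOT_INTERVAL_S
        or state.wal_bytes_since_snapshot >= SNAPSHOT_WAL_BYTES
    )
    if snapshot_due:
        return "snapshot"
    if now - state.last_wal_ship >= WAL_INTERVAL_S:
        return "ship_wal"
    return "wait"
```

Checking the snapshot condition first means a large burst of writes triggers a fresh snapshot early, which bounds how much WAL a restore has to replay.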

WAL streaming also doubled as a key foundation for building materialized views, supporting real-time triggers, and acting as an online data-verification layer, while S3 itself served as insurance against hardware errors (memory, CPU, network, disk) and data corruption.

https://web.archive.org/web/20220712155558/https://www.useni... (keyword search S3)

Elasticsearch / OpenSearch do something similar, but they only implement snapshot-based replication to S3 (periodic backups).

https://web.archive.org/web/20190722153122/https://www.micro... / https://archive.is/Q5jUj (docs)