by ctur 1426 days ago
If a "FUSE to replicate SQLite" solution came from anywhere else, I'd be quite skeptical, but there is a lot of very interesting tech coming out of fly.io these days and Ben certainly knows this space well. It still feels a little like a hack that pierces layers of abstraction (less so than, say, litestream).

I love it when at first glance it isn't clear if a project is a crazy idea from someone just goofing around vs a highly leveraged crazy idea that will be a foundational part of a major technology shift.

I suspect it's the latter, and that the strategy is to layer this on top of litestream to create an easy way to use sqlite transparently in a widely distributed multi-node environment (litestream providing the backups and/or read-only replication to remote sites, with LiteFS handling low-latency local access in a cluster, POP, or data center).

Cool stuff. It will be fun to see where fly takes this :)


Thanks for the vote of confidence! I can understand the "hack" feel -- it's a trade-off. If I wrote it the "proper" way and integrated directly into the SQLite source or used a VFS then it'd be a lot harder to deploy for most folks. By making it a FUSE file system, someone can use it without really knowing much about it from the application's perspective.

As for strategy, it unfortunately doesn't work to layer with Litestream as backups need some strict control over who is the current primary. Instead, I'm adding S3 replication support [1] directly into LiteFS. LiteFS also uses a different transactional file format called LTX so it wouldn't be compatible with Litestream. The LTX format is optimized for compactions so point-in-time restores can be nearly instant.
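To illustrate why compaction helps point-in-time restores, here is a minimal sketch. It is not the actual LTX format or LiteFS code: `TxFile`, `compact`, and `restore` are hypothetical names, and the sketch assumes each transaction file simply maps page numbers to the page contents written by a contiguous range of transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxFile:
    """Hypothetical transaction file covering transactions min_txid..max_txid."""
    min_txid: int
    max_txid: int
    pages: dict[int, bytes]  # page number -> contents after these transactions


def compact(files: list[TxFile]) -> TxFile:
    """Merge contiguous transaction files, keeping only the newest copy of each page."""
    if not files:
        raise ValueError("nothing to compact")
    ordered = sorted(files, key=lambda f: f.min_txid)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.min_txid != prev.max_txid + 1:
            raise ValueError("transaction files must be contiguous")
    merged: dict[int, bytes] = {}
    for f in ordered:
        merged.update(f.pages)  # later files overwrite earlier page versions
    return TxFile(ordered[0].min_txid, ordered[-1].max_txid, merged)


def restore(files: list[TxFile], target_txid: int) -> dict[int, bytes]:
    """Rebuild the page set as of target_txid by applying files in order."""
    db: dict[int, bytes] = {}
    for f in sorted(files, key=lambda f: f.min_txid):
        if f.max_txid > target_txid:
            break  # this file contains changes newer than the restore point
        db.update(f.pages)
    return db
```

Under these assumptions, a restore from compacted files reads each page at most once per compacted file instead of replaying every individual transaction, which is the intuition behind restores getting cheaper. The trade-off is granularity: a compacted file can only be restored to its `max_txid`, not to a point inside its range.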

The end goal isn't much of a secret. We want to let folks spin up nodes in regions across the world, automatically connect to one another, and have the whole thing have the ease of a single node app. We still have a ways to go on that vision but that's what we're going for.

[1] https://github.com/superfly/litefs/issues/18

> I'm adding S3 replication support directly into LiteFS.

Nice! There's a lot of value one can get out of a blob store, even though it seems at odds with block-device-dependent systems, like most SQL DBMSs.

When a database at BigCloud layered replication (real-time backups) atop S3, they did so by shipping both the WAL and the on-disk files. For write-heavy tables, the WAL was streamed every second, and on-disk files (snapshots) every 30 minutes (or at some apt size-based threshold).
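A rough sketch of that kind of schedule, as I understand it. This is not their code: `ShipState`, `plan_uploads`, and the 64 MiB size threshold are made up for illustration; only the one-second and 30-minute intervals come from the description above.

```python
from dataclasses import dataclass

WAL_INTERVAL_S = 1.0                    # stream the WAL roughly every second
SNAPSHOT_INTERVAL_S = 30 * 60           # snapshot at least every 30 minutes
SNAPSHOT_SIZE_BYTES = 64 * 1024 * 1024  # hypothetical size-based threshold


@dataclass
class ShipState:
    last_wal_ship: float = 0.0
    last_snapshot: float = 0.0
    wal_bytes_since_snapshot: int = 0


def plan_uploads(state: ShipState, now: float, pending_wal_bytes: int) -> list[str]:
    """Decide what to upload to the blob store at time `now` (seconds)."""
    actions: list[str] = []
    # Ship pending WAL data once the streaming interval has elapsed.
    if pending_wal_bytes > 0 and now - state.last_wal_ship >= WAL_INTERVAL_S:
        actions.append("wal")
        state.last_wal_ship = now
        state.wal_bytes_since_snapshot += pending_wal_bytes
    # Take a snapshot on a timer, or earlier if enough WAL has accumulated.
    if (now - state.last_snapshot >= SNAPSHOT_INTERVAL_S
            or state.wal_bytes_since_snapshot >= SNAPSHOT_SIZE_BYTES):
        actions.append("snapshot")
        state.last_snapshot = now
        state.wal_bytes_since_snapshot = 0
    return actions
```

A caller would invoke `plan_uploads` from its flush loop and perform the actual uploads for whatever actions come back. The size-based trigger bounds how much WAL has to be replayed on top of the last snapshot during a restore.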

While WAL streaming also doubled up as a key foundation for them to build materialized views, support real-time triggers, and act as an online data-verification layer, S3 itself served as insurance against hardware errors (memory, CPU, network, disk) and data corruption.

https://web.archive.org/web/20220712155558/https://www.useni... (keyword search S3)

Elasticsearch / OpenSearch does something similar, but it only implements snapshot-based replication to S3 (periodic backups).

https://web.archive.org/web/20190722153122/https://www.micro... / https://archive.is/Q5jUj (docs)

> a highly leveraged crazy idea that will be a foundational part of a major technology shift

Has anything other than the Cloud presented a true foundational shift in how applications are built? Kubernetes, Serverless, Blockchain, React, Swift: these things are big, but not big enough.

I think we just like pretending every little thing is the next big thing.