by 0x74696d
682 days ago
This architecture is roughly how HashiCorp's Nomad, Consul, and Vault are built (I'm one of the maintainers of Nomad). While it's definitely a "weird" architecture, the developer experience is really nice once you get the hang of it. The in-memory state can be whatever you want, which means you can build up your own application-specific indexing and querying functions. You could just use SQLite with :memory: for the Raft FSM, but if you can build/find an in-memory transaction store (we use our own go-memdb), then reading from the state is just function calls. Protecting yourself from stale reads or write skew is trivial: every object you write has a Raft index, so you can write APIs like "query a follower for object foo and wait till it's at least at index 123". It sweeps away a lot of "magic" that normally you'd shove into an RDBMS or other external store.
That being said, I'd be hesitant to pick this kind of architecture for a new startup outside of the "infrastructure" space, because you are effectively building your own database here. You need to pick (or write) good primitives for things like your inter-node RPC, on-disk persistence, in-memory transactional state store, etc. Upgrades are especially challenging, because the new code can try to write entities to the Raft log that nodes still on the previous version don't understand (or worse, misunderstand because the way they're handled has changed!). There's no free lunch.
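To make the "wait till it's at least at index 123" idea concrete, here is a minimal sketch in Go. It is not how Nomad or go-memdb actually implement it; `Store`, `Apply`, `GetAtLeast`, and `runExample` are hypothetical names, and a plain mutex-guarded map stands in for a real transactional state store.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Object is a stored value tagged with the Raft index that last wrote it.
type Object struct {
	Value       string
	ModifyIndex uint64
}

// Store is a hypothetical in-memory state store fed by a Raft FSM.
type Store struct {
	mu        sync.Mutex
	lastIndex uint64        // highest Raft index applied so far
	objects   map[string]Object
	changed   chan struct{} // closed and replaced on every Apply
}

var ErrNotFound = errors.New("object not found")

func NewStore() *Store {
	return &Store{objects: map[string]Object{}, changed: make(chan struct{})}
}

// Apply is what the FSM would call for each committed log entry.
func (s *Store) Apply(index uint64, key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[key] = Object{Value: value, ModifyIndex: index}
	s.lastIndex = index
	close(s.changed) // wake every reader waiting for progress
	s.changed = make(chan struct{})
}

// GetAtLeast blocks until the store has applied minIndex (or ctx ends),
// then reads key. This is what rules out a stale read on a follower.
func (s *Store) GetAtLeast(ctx context.Context, key string, minIndex uint64) (Object, error) {
	for {
		s.mu.Lock()
		if s.lastIndex >= minIndex {
			obj, ok := s.objects[key]
			s.mu.Unlock()
			if !ok {
				return Object{}, ErrNotFound
			}
			return obj, nil
		}
		wait := s.changed // capture under the lock so no wakeup is missed
		s.mu.Unlock()
		select {
		case <-wait:
		case <-ctx.Done():
			return Object{}, ctx.Err()
		}
	}
}

// runExample simulates a follower that catches up to index 123 shortly
// after a reader has asked for it.
func runExample() (Object, error) {
	s := NewStore()
	go func() {
		time.Sleep(10 * time.Millisecond)
		s.Apply(123, "foo", "bar")
	}()
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return s.GetAtLeast(ctx, "foo", 123)
}

func main() {
	obj, err := runExample()
	fmt.Println(obj.Value, obj.ModifyIndex, err)
}
```

A writer that gets index 123 back from the leader can hand that index to any follower and be sure it reads its own write, at the cost of waiting for that follower to catch up.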
|
That's the basic design that rqlite[1] had for its first ~7 years. :-) But rqlite moved to on-disk SQLite, since with WAL mode, and with 'PRAGMA synchronous=OFF' [2], it is about as fast as writing to RAM. Or at least close enough, and I avoid all the limitations that come with :memory: SQLite databases (max size of 2GB being one). I should have just used on-disk mode from the start, but only now know better.
(I'm guessing you may know some of this because rqlite uses the same Raft library [3] as Nomad.)
As for the upgrade issue you mention, yes, it's real. Do you find it in the field much with Nomad? I've introduced new Raft Entry types only very infrequently during rqlite's 10 years of development, and only once did someone hit the problem in the field. Of course, one way to deal with it is to first release a version of one's software that understands the new types but never writes them. Once that version is fully deployed, upgrade to the version that actually writes the new types too. I've never bothered to do this in practice, however, and it requires discipline on the part of the end-users too.
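The two-step rollout described above could be sketched in Go roughly as follows. This is an illustration only, not rqlite's or Nomad's code: `EntryType`, `Entry`, `Node`, the `WriteV2` gate, and the `TTL` field are all hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// EntryType tags each Raft log entry so the FSM knows how to apply it.
type EntryType uint8

const (
	EntrySetV1 EntryType = iota + 1 // understood by every released version
	EntrySetV2                      // new type introduced by this release
)

// Entry is a hypothetical log entry payload.
type Entry struct {
	Type  EntryType
	Key   string
	Value string
	TTL   int // only carried by EntrySetV2
}

var ErrUnknownEntry = errors.New("unknown raft entry type")

// Node models one cluster member running the new release.
type Node struct {
	// WriteV2 stays false in the first release, which can read the new
	// type but never writes it. It is flipped only once every node in
	// the cluster runs a version that understands EntrySetV2.
	WriteV2 bool
	state   map[string]string
}

func NewNode() *Node {
	return &Node{state: map[string]string{}}
}

// Encode picks the entry type to write based on the rollout gate.
func (n *Node) Encode(key, value string, ttl int) Entry {
	if n.WriteV2 {
		return Entry{Type: EntrySetV2, Key: key, Value: value, TTL: ttl}
	}
	return Entry{Type: EntrySetV1, Key: key, Value: value}
}

// Apply handles both the old and the new type, and fails loudly on
// anything else instead of silently misinterpreting it.
func (n *Node) Apply(e Entry) error {
	switch e.Type {
	case EntrySetV1, EntrySetV2:
		n.state[e.Key] = e.Value
		return nil
	default:
		return fmt.Errorf("%w: %d", ErrUnknownEntry, e.Type)
	}
}

func main() {
	n := NewNode()
	e := n.Encode("foo", "bar", 30) // gate is off, so a V1 entry is written
	fmt.Println(e.Type == EntrySetV1, n.Apply(e))

	n.WriteV2 = true // second step, after the whole cluster is upgraded
	e = n.Encode("foo", "baz", 30)
	fmt.Println(e.Type == EntrySetV2, n.Apply(e))

	fmt.Println(n.Apply(Entry{Type: 99})) // a type this version predates
}
```

The sketch flips the gate by hand; a real system would still need some way to know that every node has been upgraded, which is where the end-user discipline comes in.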
[1] https://www.rqlite.io
[2] This might sound dangerous, but in the current design of rqlite, the underlying SQLite database is completely rebuilt from the Raft log on startup (and the log is fsync'ed on every write). So any corruption of the SQLite database due to power loss, etc. is moot, since the SQLite database is not the authoritative store of data in rqlite.
[3] https://github.com/hashicorp/raft