| > "Likely the idea that filesystems should run as userspace / unprivileged (or at least limited privilege) processes which would make them, ultimately, indistinguishable from a form of database engine." "Userspace vs not" is a different argument from "consistency vs not" or "atomicity vs not" or "POSIX vs not". Someone still needs to solve that problem. Sure instead of SQLite over POSIX you could implement POSIX over SQLite over raw blocks. But you haven't gained anything meaningful. > Persistent file systems are essentially key-value stores I think this is reductive enough to be equivalent to "a key-value store is a thin wrapper over the block abstraction, as it already provides a key-value interface, which is just a thin layer over taking a magnet and pointing it at an offset". Persistent filesystems can be built over key-value stores. This is especially common in distributed filesystems. But they also circumvent a key-value abstraction entirely. > IMO a big problem with POSIX filesystems is the lack of atomicity Atomicity requires write-ahead logging + flushing a cache. I fail to see why this needs to be mandatory, when it can be effectively implemented at a higher layer. > This and a complete lack of consistent networked API A consistent networked API would require you to hit the metadata server for every operation. No caching. Your system would grind to a halt. Finally, nothing in the POSIX spec prohibits an atomic filesystem or consistency guarantees. It is just that no one wants to implement these things that way because it overprovisions for one property at the expense of others. |
This was an attempt to possibly explain the microkernel point GP made, which only really matters below the FS.
> I think this is reductive enough to be equivalent to "a key-value store is a thin wrapper over the block abstraction, as it already provides a key-value interface, which is just a thin layer over taking a magnet and pointing it at an offset".
I disagree with this premise. Key-value stores are an API, not an abstraction over block storage (though many are or can be configured to be so). File systems are essentially a superset of a KV API with a multitude of "backing stores". Saying KV stores are always backed by blocks is overly reductive, no?
> Atomicity requires write-ahead logging + flushing a cache. I fail to see why this needs to be mandatory, when it can be effectively implemented at a higher layer.
You're confusing durability for atomicity. You don't need a log to implement atomicity, you just need a way to lock one or more entities (whatever the unit of atomic updates are). A CoW filesystem in direct mode (zero page caching) would need neither but could still support atomic updates to file (names).
> A consistent networked API would require you to hit the metadata server for every operation. No caching. Your system would grind to a halt.
Sorry, I don't mean consistent in the ACID context, I mean consistent in the loosely defined API shape context. Think NFS or 9P.
I also disagree with this to some degree: pipelined operations would certainly still be possible and performant but would be rather clunky. End-to-end latency for get->update-write, the common mode of operation, would be pretty awful.
> Finally, nothing in the POSIX spec prohibits an atomic filesystem or consistency guarantees. It is just that no one wants to implement these things that way because it overprovisions for one property at the expense of others.
I didn't say it did, but it doesn't require it which means it effectively doesn't exist as far as the users of FS APIs are concerned. Rename operations are the only API that atomicity is required by POSIX. However without a CAS-like operation you can't safely implement a lock without several extra syscalls.