by anonymous-panda 832 days ago
Directories make up a hierarchical filesystem, but hierarchy is not a necessary condition for being a filesystem. A filesystem at its core is just a way of organizing files. If you’re storing and organizing files in S3 then it’s a filesystem for you. Saying it’s “fundamentally a key-value store” like it’s something different is confusing, because a filesystem is just a key-value store mapping a path to the contents of a file.

Indeed there’s every reason to believe that a modern file system would perform significantly faster if the hierarchy were implemented as a prefix filter rather than by actually maintaining the hierarchical data structures (at least for most operations). One hint that this might be the case is that file creation is extremely slow on modern file systems (on the order of hundreds or maybe thousands per second on a modern NVMe disk that can otherwise do millions of IOPS), and listing the contents of an extremely large directory is exceedingly slow.
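
To make the "hierarchy as a prefix filter" idea concrete, here is a minimal sketch. It assumes a flat, sorted in-memory key space; `FlatStore` and its methods are hypothetical names for illustration, not any real filesystem or S3 API.

```python
import bisect


class FlatStore:
    """Hypothetical flat key-value store: full path -> file contents."""

    def __init__(self):
        self._keys = []  # sorted list of full paths
        self._data = {}  # path -> bytes

    def put(self, path, contents):
        # Creating a file is one insert into the flat key space;
        # no directory structure is updated.
        if path not in self._data:
            bisect.insort(self._keys, path)
        self._data[path] = contents

    def get(self, path):
        return self._data[path]

    def list_dir(self, prefix):
        """Immediate children under `prefix` (pass '' or a string ending in '/')."""
        children = []
        start = bisect.bisect_left(self._keys, prefix)
        for i in range(start, len(self._keys)):
            key = self._keys[i]
            if not key.startswith(prefix):
                break  # keys are sorted, so the prefix range has ended
            rest = key[len(prefix):]
            if not rest:
                continue  # the key equals the prefix itself
            head, sep, _ = rest.partition("/")
            child = head + sep  # 'name' for a file, 'name/' for a subdirectory
            if not children or children[-1] != child:
                children.append(child)
        return children
```

The "directory" exists only as a shared prefix: listing is a range scan over sorted keys, and nothing hierarchical is maintained on write.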

3 comments

In context of the comment I was addressing, it’s clear that filesystem means more than just a key value store. I’d argue that this is generally true in common vernacular.
This is a technical website discussing the nuances of filesystems. Common vernacular is how you choose to define it but even the Wikipedia definition says that directories and hierarchy are just one property of some filesystems. That they became the dominant model on local machines doesn’t take away from the more general definition that can describe distributed filesystems.
I'm kind of chuckling at this thread because you're working so hard to not understand.

I think the previous poster could/should have said, "It is not a hierarchical file system and has no concept of directories", where I added the word "hierarchical".

But it's also pretty obvious that was the point.

I disagree with that characterization, because the contrast drawn by OP was that S3 is “just a KV store”, implying it doesn’t meet the criteria for being considered a filesystem.

For example, you could implement POSIX directory semantics on top of S3. About the only POSIX filesystem API you couldn’t implement is append / overwrite (well, you could, but it might be prohibitively expensive).
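
As a rough sketch of what that layering could look like, the following puts directory-style operations on top of a flat key space. A plain dict stands in for the object store, and `DirLayer` with all of its methods is hypothetical; none of this is a real S3 client API, and a real implementation would use prefix listing instead of scanning every key.

```python
import errno


class DirLayer:
    """Hypothetical directory semantics over a flat dict standing in for an object store."""

    def __init__(self):
        # Keys ending in '/' are directory markers; '/' is the root.
        self.objects = {"/": b""}

    @staticmethod
    def _parent(path):
        return path.rstrip("/").rsplit("/", 1)[0] + "/"

    def mkdir(self, path):
        # `path` is expected to end in '/'.
        if path in self.objects:
            raise FileExistsError(path)
        if self._parent(path) not in self.objects:
            raise FileNotFoundError(self._parent(path))
        self.objects[path] = b""

    def write(self, path, data):
        if self._parent(path) not in self.objects:
            raise FileNotFoundError(self._parent(path))
        self.objects[path] = data

    def append(self, path, data):
        # Read-modify-write of the whole value: this is the step that
        # could become prohibitively expensive for large objects.
        self.write(path, self.objects.get(path, b"") + data)

    def readdir(self, path):
        if path not in self.objects:
            raise FileNotFoundError(path)
        names = set()
        for key in self.objects:  # full scan, for simplicity only
            if key != path and key.startswith(path):
                head, sep, _ = key[len(path):].partition("/")
                names.add(head + sep)
        return sorted(names)

    def rmdir(self, path):
        if self.readdir(path):
            raise OSError(errno.ENOTEMPTY, "directory not empty", path)
        del self.objects[path]
```

Note that `append` rewrites the entire object each time, which is the cost the comment above is hedging about.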

A real hierarchy makes global constraints easier to scale, e.g. globally unique names or hierarchical access controls. These policies only need to scale to a single node rather than to the whole namespace (via some sort of global index).
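
One way to picture the hierarchical access control case is the sketch below. It assumes absolute, '/'-separated paths and a hypothetical `acls` mapping from directory path to policy; the function name and the policy values are made up for illustration.

```python
def effective_acl(path, acls):
    """Return the policy of the nearest ancestor (or the path itself) that defines one.

    `acls` is a hypothetical dict mapping paths (directories end in '/') to policies.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    node = path
    while True:
        if node in acls:
            return acls[node]
        if node == "/":
            return None  # reached the root without finding a policy
        # Step up one level: '/a/b/c' -> '/a/b/', '/a/b/' -> '/a/'
        node = node.rstrip("/").rsplit("/", 1)[0] + "/"


acls = {"/": "read-only", "/home/alice/": "alice-rw"}
print(effective_acl("/home/alice/notes/todo.txt", acls))
print(effective_acl("/etc/passwd", acls))
```

The lookup only walks the ancestors of one path, so a policy attached to a single directory node covers everything beneath it without a global index.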
no - a filesystem implementation on an ordinary OS has more than what you mention, including interfaces to disk device drivers