Hacker News
by throwaway09223 1793 days ago
Yes but a filesystem is also a hierarchical database.

A filesystem solves these issues specifically because it avoids reimplementation. As the registry has been extended, as you say, it approaches parity with filesystem functionality, but on a parallel track.

At a high level, avoiding multiple implementations of similar metaphors is ideal in terms of security. Reuse what you have.

I'd agree a filesystem is also a type of hierarchical database but the author doesn't think so:

"Back to point 1, the Registry is a half-assed, poor quality implementation of a filesystem. Importantly, it’s not a database. It should be a database!"

Noting "not a database" is bolded.

Sure, and I would agree with you here.

These are the kinds of categorizations that people can go nuts over. Rather than get too hung up on words, I'd say that whatever this is, it can effectively be represented by a filesystem, and therefore it should be, as a matter of general architecture and security principle.

I'm actually with the author that, if it were going to be rewritten, a freshly written columnar database would be way more efficient than representing it as a filesystem, but either would be better than what we have after 30 years. I just don't think "it wasn't a filesystem originally" has much to do with why it's so crap now. Similar case: POSIX specifies that network sockets be accessed as files/filesystems (as most everything in POSIX is), but nobody actually used that representation because it's inefficient, even though it's the standard and easily mappable to files/filesystems. Well, I think Solaris actually allows both, but the point stands.

Sorry, I'm unfamiliar with what you mean by "network sockets be accessed as files." Do you mean Unix domain sockets? These are in fact commonly used, and they're certainly no less efficient (more efficient in many ways, in fact).

UDS are interfaced with via the same Berkeley sockets API, not via the filesystem API. Have you ever written applications that use them?

I don't mean Unix domain sockets; those are known as IPC sockets. The Berkeley sockets API you are familiar with is actually exactly what I was talking about. It does offer both types of sockets (the other being network sockets, as I originally mentioned), but it uses handles in an abstract namespace, not files in a filesystem (e.g. in Linux it's still an FD, but it doesn't map to an actual file on a filesystem; it's just a unique handle in its own namespace).

What I was referring to were things like /dev/tcp/ and /dev/udp, which you'll find on Solaris (or emulated via bash on most systems) and which are actual filesystem paths instead of handles in an abstract namespace. A usage example of this, comparable to binding to a socket with the BSD API to udp://localhost:2048, would be: echo "example" > /dev/udp/localhost/2048. The actual I/O is through the standard file/filesystem interface, just like /dev/random. It's not the best for network sockets though, so they tend to get a raw handle in every modern OS, even if it does mean rebuilding the wheel on some other things.
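
For comparison, here is a minimal sketch of the same kind of send done through the BSD sockets API from Python. The helper names `send_udp` and `demo` are made up for illustration, and the demo uses an ephemeral loopback port rather than 2048 so it can receive its own datagram; treat it as a sketch of the idea, not a statement about how bash or Solaris implement /dev/udp.

```python
import socket


def send_udp(host: str, port: int, payload: bytes) -> int:
    """Send a single UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def demo() -> bytes:
    """Send b"example\\n" to a local receiver and return what it received."""
    # Receiver bound to an ephemeral loopback port (assumption: loopback
    # networking is available) so the example is self-contained.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx:
        rx.bind(("127.0.0.1", 0))
        rx.settimeout(2.0)
        port = rx.getsockname()[1]
        send_udp("127.0.0.1", port, b"example\n")
        data, _addr = rx.recvfrom(1024)
        return data


if __name__ == "__main__":
    print(demo())
```

The socket here is addressed by a (host, port) tuple passed to `sendto`, not by a path under /dev, which is the distinction being drawn above.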

Network sockets are the canonical example of "not everything in Unix is a file". "Everything is an FD" is true, but "everything is a handle" is true of any OS design; the unique property that things like RAM and disks are just files under / did not hold true for networking.

And yes, I have written plenty of apps with IPC sockets and network sockets and raw sockets and even underlying device access (for things like custom Ethernet packets). I'm in networking by profession.

I think there's some confusion about how the sockets API works; let me see if I can clear this up.

POSIX does not specify that network sockets should be accessed by file paths. It's possible to do so, but unspecified by the standard.

Sockets produced by socket(2) are regular old file descriptors, just as created by open(2) on a file path, or any other descriptor-generating syscall like pipe(2) or epoll_create(2). There is no separate representation among any of these -- they are all just file descriptors. There are many, many ways to create descriptors, and many aren't associated with a filesystem. There's no efficiency issue here, nor is there a divergence from a consistent pattern.

If you like, you can use fchmod(2) on a descriptor generated by socket(2) and change its permissions. You can track it by its inode. It doesn't matter that the descriptor is not linked to a filesystem, any more than for a similar descriptor created by pipe(2). They all have the same functionality and fit within the same consistent metaphor. When you run grep | grep, the pipe descriptor has permissions, mtime, ctime, atime and the rest. Everything just works.
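
A quick way to see this for yourself is to call fstat(2) on descriptors from different sources. The sketch below uses Python's `os.fstat`; the helper names `describe` and `demo` are made up for illustration, and the exact permission bits and inode numbers you get back will vary by OS.

```python
import os
import socket
import stat
import tempfile


def describe(fd: int) -> dict:
    """Return the file type and basic metadata that fstat reports for fd."""
    st = os.fstat(fd)
    if stat.S_ISSOCK(st.st_mode):
        kind = "socket"
    elif stat.S_ISFIFO(st.st_mode):
        kind = "pipe"
    elif stat.S_ISREG(st.st_mode):
        kind = "file"
    else:
        kind = "other"
    return {
        "kind": kind,
        "inode": st.st_ino,
        "perms": oct(stat.S_IMODE(st.st_mode)),
        "mtime": st.st_mtime,
    }


def demo() -> dict:
    """Describe a pipe, a socket, and a regular file via the same call."""
    r, w = os.pipe()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tmp = tempfile.TemporaryFile()
    try:
        return {
            "pipe": describe(r),
            "socket": describe(sock.fileno()),
            "file": describe(tmp.fileno()),
        }
    finally:
        os.close(r)
        os.close(w)
        sock.close()
        tmp.close()


if __name__ == "__main__":
    for name, info in demo().items():
        print(name, info)
```

All three descriptors go through the same `os.fstat` call; only the reported file type differs.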

It's trivial to write a filesystem to expose descriptors; in fact, /proc does this already for all descriptor tables across all processes. There's no rebuilding of any wheel - the point of commonality is the "struct file" in linux/fs.h.
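
As a rough illustration of the /proc point, this sketch lists the current process's descriptor table through /proc/self/fd. It is Linux-specific (it assumes procfs is mounted at /proc), and the helper names `fd_targets` and `demo` are made up for the example.

```python
import os
import socket


def fd_targets() -> dict:
    """Map each open descriptor of this process to its /proc symlink target."""
    base = "/proc/self/fd"  # assumption: Linux with procfs mounted at /proc
    targets = {}
    for name in os.listdir(base):
        try:
            targets[int(name)] = os.readlink(os.path.join(base, name))
        except OSError:
            # The descriptor used to read the directory itself may be gone
            # by the time we try to resolve it; skip it.
            continue
    return targets


def demo() -> tuple:
    """Return the /proc link targets for a fresh pipe and a fresh socket."""
    r, w = os.pipe()
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        targets = fd_targets()
        return targets[r], targets[sock.fileno()]
    finally:
        os.close(r)
        os.close(w)
        sock.close()


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        print(demo())
    else:
        print("/proc/self/fd is not available on this system")
```

On Linux, neither the pipe nor the socket was opened from a path, yet both show up as entries in a directory, with link targets that name the descriptor type and inode rather than a file path.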

There's no such thing as a "raw handle" here, btw. That phrase has no meaning.