by zamadatix 1789 days ago
The Windows registry is certainly a database, perhaps just not the type you're used to: https://en.wikipedia.org/wiki/Hierarchical_database_model

Most of the actual technical issues you list have more to do with it being extended for the last 30 years in a backwards compatible way than anything to do with it being a hierarchical db instead of a filesystem.


I still see it as a file system, very similar to NTFS (in the sense of having similar features). Apart from the (recent) project just mentioned (ProjFS), there used to be a file-system-like driver for it. Just for the record:

http://reboot.pro/topic/7681-the-registry-as-a-filesystem/

https://web.archive.org/web/20090413131629/http://czwsoft.dy...

https://web.archive.org/web/20140401212651/http://pasotech.a...

And:

https://github.com/jbruchon/winregfs

It probably seems similar because file systems are typically classified as a type of hierarchical db themselves. That being said, "I can represent it with a file in a filesystem" is different from "it is a filesystem". In POSIX (nearly) everything is accessible through the filesystem, even network sockets; that doesn't mean everything's canonical representation is a filesystem, just that it's mappable.

Regardless, the point wasn't "a filesystem couldn't represent a rewritten registry". It was that the registry is actually a database today (whether the reader views it as a filesystem-like db or as the hierarchical db it is listed as), and that the rest of the technical problems come from it being 30 years old and never rewritten, not from it lacking a filesystem representation as its primary view in the first place.
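
To make the "mappable" distinction concrete, here is a minimal sketch in Python. The nested dict is a hypothetical stand-in for a hierarchical store (it is not the real registry format, and the key names are made up); `as_paths` derives a filesystem-style view from it, while `lookup` shows that the tree itself remains the canonical representation.

```python
from pathlib import PurePosixPath

# Hypothetical hierarchical store: dicts act as keys, leaves as values.
# This is an illustration only, not the registry's actual layout.
store = {
    "Software": {
        "Example": {"Version": "1.0", "InstallDir": "C:\\Example"},
    },
    "System": {"Hostname": "demo"},
}

def as_paths(node, prefix=PurePosixPath("/")):
    """Yield (path, value) pairs: a filesystem-style view of the tree."""
    for name, child in node.items():
        path = prefix / name
        if isinstance(child, dict):
            yield from as_paths(child, path)
        else:
            yield str(path), child

def lookup(node, path):
    """Resolve a path-style name against the native tree."""
    for part in PurePosixPath(path).parts[1:]:  # skip the leading "/"
        node = node[part]
    return node

paths = dict(as_paths(store))
```

The path view is derived on demand; nothing about the store had to be a filesystem for the mapping to exist.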

From the "rant" rwmj just posted a link to:

https://rwmj.wordpress.com/2010/02/18/why-the-windows-regist...

>This misses the point: the Registry is a filesystem. Sure it’s stored in a file, but so is ext3 if you choose to store it in a loopback mount. The Registry binary format has all the aspects of a filesystem: things corresponding to directories, inodes, extended attributes etc.

> The major difference is that this Registry filesystem format is half-arsed. The format is badly constructed, fragile, endian-specific, underspecified and slow.

Anyway, file systems and databases are essentially similar; the point revolves more around the poor implementation of the Registry (whatever it is).

I think everyone is in agreement it's bad, as I said:

> Most of the actual technical issues you list have more to do with it being extended for the last 30 years in a backwards compatible way than anything to do with it being a hierarchical db instead of a filesystem.

My first line about it being a database was about point 7 in the same link:

> Back to point 1, the Registry is a half-assed, poor quality implementation of a filesystem. Importantly, it’s not a database. It should be a database!

With "not a database" in bold.

Before I forget, there is also multi-commander that uses the "filesystem approach":

http://multicommander.com/

http://multicommander.com/docs/browse-registry

Technically a file system is just a special database. I think a better formulation of the author's point would be "the registry is a lot like a file system, even though a more traditional database approach or fully embracing it as a file system would have probably worked out better".

Also, they would have been able to at least improve the on-disk format with a major version; I highly doubt that the registry itself is backwards-compatible anyway and there are probably very few programs that access it directly.

That's a really good take on what the author was going for, I appreciate it! I still disagree that it starting out as a filesystem or database has anything to do with why it's so crap 30 years later, but your version gets to the crux of the topic much quicker.

With how tightly the APIs for accessing the registry are coupled with the model and encodings of the registry, particularly the driver APIs for it, I don't think it would have been so easy to just swap out the back end without breaking something (which Windows avoids like the plague), but maybe it's doable by someone more optimistic than me :). The real "rewrite" was the push for Universal Windows apps using the .NET platform, which stores everything for the app in XML files and shadow directories instead of the registry. Of course that didn't take over quite like they hoped, so 10 years later they ended up back with the registry they were trying to leave.

Yes but a filesystem is also a hierarchical database.

A filesystem solves these issues specifically because it avoids reimplementation. As the registry has been extended, as you say, it approaches parity with filesystem functionality, but on a parallel track.

At a high level, avoiding multiple implementations of similar metaphors is ideal in terms of security. Reuse what you have.

I'd agree a filesystem is also a type of hierarchical database but the author doesn't think so:

"Back to point 1, the Registry is a half-assed, poor quality implementation of a filesystem. Importantly, it’s not a database. It should be a database!"

Noting "not a database" is bolded.

Sure, and I would agree with you here.

These are the kinds of categorizations that people can go nuts over. Rather than get too hung up on words, I'd say that whatever this is, it can effectively be represented by a filesystem, and therefore it should be, as a matter of general architecture and security principle.

I'm actually with the author that, if it were going to be rewritten, a freshly written columnar database would be way more efficient than representing it as a filesystem, but either would be better than what we have after 30 years. I just don't think "it wasn't a filesystem originally" has much to do with why it's so crap now. Similar case: POSIX specifies network sockets be accessed as files/filesystems (as most everything in POSIX is), but nobody actually used that representation because it's inefficient, even though it's the standard and easily mappable to files/filesystems. Well, I think Solaris actually allows both, but the point stands.

Sorry, I'm unfamiliar with what you mean by "network sockets be accessed as files." Do you mean Unix domain sockets? These are in fact commonly used and they're certainly no less efficient (more efficient in many ways, in fact).

UDS are interfaced with via the same Berkeley sockets API, not via the filesystem API. Have you ever written applications that use them?
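
For anyone following along, a small sketch of that point in Python, assuming a POSIX-like system where `AF_UNIX` is available. The socket name `demo.sock` is made up for the example. Binding creates a node in the filesystem, but the ordinary file API is expected to refuse it (on Linux, `open` fails with `ENXIO`; other systems may report a different error), while the sockets API works.

```python
import os
import socket
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sock")  # hypothetical socket name

    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    server.bind(path)  # creates a filesystem entry of type "socket"
    is_socket_node = stat.S_ISSOCK(os.stat(path).st_mode)

    # The plain file API is refused on the socket node...
    try:
        open(path, "wb").close()
        open_failed = False
    except OSError:
        open_failed = True

    # ...while the sockets API can address it by the same path.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    client.sendto(b"example", path)
    received = server.recv(64)

    client.close()
    server.close()
```

So the path is only a name used for rendezvous; the I/O itself never goes through `open`/`read`/`write` on that path.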

I don't mean Unix domain sockets; those are known as IPC sockets. The Berkeley sockets API you are familiar with is actually exactly what I was talking about. It does offer both types of sockets (the other being network sockets, as I originally mentioned), but it uses handles in an abstract namespace, not files in a filesystem (e.g. in Linux it's still an FD, but it doesn't map to an actual file on a filesystem; it's just a unique handle in its own namespace).
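
A quick way to see the "FD but not a file" part, sketched in Python and assuming a POSIX-like system (on Windows `os.fstat` on a socket handle is not expected to work):

```python
import os
import socket
import stat

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
fd = sock.fileno()                 # a small integer handle
mode = os.fstat(fd).st_mode
is_socket = stat.S_ISSOCK(mode)    # the descriptor refers to a socket...
is_regular = stat.S_ISREG(mode)    # ...not to a regular file
sock.close()
```

The descriptor was obtained without ever naming a path, which is the abstract-namespace point.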

What I was referring to were things like /dev/tcp/ and /dev/udp/ that you'll find on Solaris (or emulated via bash on most systems), which are actual filesystem paths instead of handles in an abstract namespace. A usage example comparable to opening a socket with the BSD API to udp://localhost:2048 would be: echo "example" > /dev/udp/localhost/2048. The actual I/O goes through the standard file/filesystem interface, just like /dev/random. It's not the best fit for network sockets though, so they tend to get a raw handle in every modern OS, even if it does mean reinventing the wheel on some other things.
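
For comparison, roughly the same thing done through the sockets API, as a Python sketch. `send_udp` is a made-up helper name. To keep the example self-contained, a local receiver on an OS-chosen port stands in for port 2048, so it does not depend on that port being free.

```python
import socket

def send_udp(message: bytes, host: str, port: int) -> int:
    """Send one datagram; roughly what the path-style redirection does."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.sendto(message, (host, port))

# Local receiver on an ephemeral port (assumption: stands in for 2048).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

# echo appends a newline, so the payload here includes one too.
sent = send_udp(b"example\n", "127.0.0.1", port)
data, _ = receiver.recvfrom(64)
receiver.close()
```

Here the destination is passed as an address tuple to `sendto`, not resolved as a path under `/`.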

Network sockets are the canonical example of "not everything in Unix is a file". "Everything is an FD" is true, but "everything is a handle" is true of any OS design; the unique property that things like RAM and disks are just files under / did not hold true for networking.

And yes, I have written plenty of apps with IPC sockets, network sockets, raw sockets, and even underlying device access (for things like custom Ethernet packets). I'm in networking by profession.

>for the last 30 years in a backwards compatible way

There's nothing that has to be backwards compatible in the registry's internal storage format; they could just design a new, sane format and keep the old API.
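
A minimal sketch of that idea in Python, under heavy assumptions: every class and function name here is hypothetical, and the two "formats" are toy stand-ins, not anything resembling the real hive format or the Win32 registry API. It also ignores the coupling between the API and the on-disk encodings raised earlier in the thread. The only point is that callers of `Registry` cannot tell which backend sits underneath.

```python
import json

class LineBackend:
    """Toy 'old format': one key=value line per entry."""
    def __init__(self):
        self.blob = ""
    def load(self):
        return dict(line.split("=", 1) for line in self.blob.splitlines())
    def store(self, data):
        self.blob = "\n".join(f"{k}={v}" for k, v in data.items())

class JsonBackend:
    """Toy 'new format': the same mapping serialized as JSON."""
    def __init__(self):
        self.blob = "{}"
    def load(self):
        return json.loads(self.blob)
    def store(self, data):
        self.blob = json.dumps(data)

class Registry:
    """The unchanged API: callers only see get_value/set_value."""
    def __init__(self, backend):
        self._backend = backend
    def set_value(self, key, value):
        data = self._backend.load()
        data[key] = value
        self._backend.store(data)
    def get_value(self, key):
        return self._backend.load()[key]

def migrate(old, new):
    """One-time conversion from the old format to the new one."""
    new.store(old.load())

old_backend = LineBackend()
Registry(old_backend).set_value(r"Software\Example\Version", "1.0")

new_backend = JsonBackend()
migrate(old_backend, new_backend)
value = Registry(new_backend).get_value(r"Software\Example\Version")
```

Whether the real API is loosely coupled enough for this to work is exactly what the thread disagrees on.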