Hacker News
by sorbits 2521 days ago
The advantage of microkernels is that they can be extended with “untrusted” code like hardware drivers or file systems. This runs in user space and thus any bugs in such code will not crash the kernel process.

So I agree with you that Linus is presenting a straw man and your comment shouldn’t have been downvoted.


> The advantage of microkernels is that they can be extended with “untrusted” code like hardware drivers or file systems. This runs in user space and thus any bugs in such code will not crash the kernel process.

Did this advantage play out in practice? If your filesystem module goes down then every module that talks to the file system module needs to gracefully handle the failure or it will still effectively crash the system.

Or the module core dumps and the system keeps chugging on, but everything is locked up because they're waiting for the return from the crashed module. Did MINIX have a way to gracefully restart crashed modules?

> Did this advantage play out in practice? If your filesystem module goes down then every module that talks to the file system module needs to gracefully handle the failure or it will still effectively crash the system.

If the file system process crashes then in theory the OS would simply relaunch it.
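
To make the "simply relaunch it" idea concrete, here is a minimal sketch of a supervisor that restarts a crashed user-space driver. It is an illustration only and not how MINIX or any real microkernel does it: `DRIVER_SOURCE` and `supervise` are made-up names, and the "driver" is just a child Python process that is assumed to crash on its first two launches.

```python
import subprocess
import sys

# Hypothetical stand-in for a user-space driver: it exits with a non-zero
# status (a "crash") unless it is told that this is launch number 3.
DRIVER_SOURCE = "import sys; sys.exit(0 if sys.argv[1] == '3' else 1)"


def supervise(max_restarts=5):
    """Run the fake driver, relaunching it after each crash.

    Returns the number of launches needed until the driver exited cleanly.
    Raises RuntimeError if it is still crashing after max_restarts restarts.
    """
    for attempt in range(1, max_restarts + 2):
        result = subprocess.run(
            [sys.executable, "-c", DRIVER_SOURCE, str(attempt)]
        )
        if result.returncode == 0:
            return attempt
        # The supervisor is not taken down by the crash. It only sees an
        # exit status and decides whether to start a fresh process.
        print(f"driver crashed (exit {result.returncode}), restarting")
    raise RuntimeError("driver keeps crashing, giving up")


if __name__ == "__main__":
    print("driver healthy after", supervise(), "launches")
```

The point of the sketch is only the isolation: the crash is contained in the child process, so the parent can keep running and retry. It says nothing about what happens to clients that were waiting on the crashed driver.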

But your core services should be stable; it’s more about extensions. For example, you may want to have virtual file systems (FTP, sshfs, etc.), which weren’t possible in the non-microkernel world until FUSE.

As for how it played out in practice: I think microkernels lost early on because of performance, and things like FUSE were created to allow the most obvious extension mechanisms for the otherwise non-extendable monolithic kernels.

That's the theory yes, but I was asking about real life. Did those early microkernel systems actually deliver?

Also, for anything stateful, like a filesystem, simply relaunching it may not be sufficient. You need to make sure it hasn't lost any data in the crash and possibly rewind some state changes in related modules.
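
One common way to deal with the "hasn't lost any data" part is a write-ahead journal that the restarted service replays. The following is a toy sketch under that assumption, not a description of any real file system: `JournaledStore` and `demo` are hypothetical names, and the "file system" is just a key-value dict backed by a JSON-lines log.

```python
import json
import os
import tempfile


class JournaledStore:
    """Toy key-value store that logs each write before applying it."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.data = {}
        self._replay()

    def _replay(self):
        # Rebuild in-memory state from the journal left by a previous run.
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, "r", encoding="utf-8") as journal:
            for line in journal:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # A torn record from a crash mid-write: stop here.
                    break
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Write-ahead: make the record durable first, then update memory.
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps({"key": key, "value": value}) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self.data[key] = value


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    first = JournaledStore(path)
    first.put("a", 1)
    first.put("b", 2)
    # Simulate a crash partway through a third write.
    with open(path, "a", encoding="utf-8") as journal:
        journal.write('{"key": "c", "val')
    # "Relaunch": a fresh instance recovers the two complete writes.
    restarted = JournaledStore(path)
    return restarted.data
```

This only covers the service's own state. A real implementation would also truncate the torn tail and checksum records, and it does nothing about the harder problem raised above of rewinding state in related modules.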

> That's the theory yes, but I was asking about real life. Did those early microkernel systems actually deliver?

According to Wikipedia “[MINIX] can also withstand driver crashes. In many cases it can automatically restart drivers without affecting running processes. In this way, MINIX is self-healing and can be used in applications demanding high reliability”.

While this kernel was originally written to teach kernel design, all Intel chipsets post-2015 are running MINIX 3 internally as the software component of the Intel Management Engine.

Another widely deployed microkernel is L4. I assume it has similar capabilities, as it also puts most things in user space and is used for mission-critical stuff.

> Also, for anything stateful, like a filesystem, simply relaunching it may not be sufficient.

True, but simply rebooting when the kernel process crashes due to buggy driver code won’t be sufficient either :)

FYI, when Apple introduced extended attributes, their AFP (network file system) implementation had a bug that made the kernel (and thus the entire machine) crash in certain edge cases involving extended attributes.

In that case, had their AFP file system been a user space process, I may still have lost data, but it would have saved me from dozens of reboots.

My NVIDIA driver hangs my system every 90 minutes or so, so I can certainly empathize with the goals & vouch that they still have a role today.
Please note that Linus wrote an operating system that in practice showed greater reliability than competing commercial microkernels. I do not believe that the principles that he came to believe in that process should be dismissed as straw man arguments.
> […] showed greater reliability than competing commercial microkernels

What is your basis for this claim?

I am only aware of QNX as a commercial microkernel (and real-time OS) and that is widely used in cars, medical devices, etc. with a strong reputation for reliability.

For many tasks, Linux is good enough and free, which is hard to beat. But that does not mean that Linus is automatically correct in his statements.

According to the public advertising at the time, Windows NT, GNU Hurd, and Mach were all designed as microkernels. Mach of course is the basis for OS X.

At the same time that Windows NT was being billed as a microkernel, Linux was outperforming it and had a reputation for being more reliable. Ditto with Mach. And GNU Hurd famously was hard to get running at all.

QNX is highly reliable, but is also a specialized use case.

Source? Speed I can imagine, but not reliability.
Tell that to my (lack of) graphics drivers. You can say it's political, but as it stands it's nowhere near apples to apples in terms of what Windows supports vs. what Linux supports.
Which video card do you have that lacks drivers for Linux? Or do you need fully open source drivers that fully support 3D acceleration and computation?
And yet somehow Linux manages to run on a greater variety of hardware than Windows does.

I am of course including supercomputers, embedded hardware, and handheld phones. Admittedly, where Windows has greater support is in running consumer desktop hardware. But that has to do with how small the Linux market share is, and is hardly an indictment of Linus' work.

It's not an indictment at all. I'm just pointing out that there is no apples to apples comparison and it's misleading to imply there is.
So similar in principle to FUSE, but applied more broadly? Seems like a neat idea.
In practice it was less useful than people assumed, because:

1. Things like drivers and filesystems are usually written by a small handful of vendors, who already have rigorous engineering cultures (hardware is a lot less forgiving than, say, web design), and a large base of demanding users who will rapidly complain and/or sue you if you get it wrong. When was the last time you personally had a crash due to a driver or filesystem issue? It used to happen semi-frequently in the Win95 days, but there was a strong incentive for hardware manufacturers to Fix Their Shit, and so all the ones who didn't went out of business.

2. You pay a hefty performance price for that stability - since the kernel is mediating all interactions between applications and drivers, every I/O call needs to go from app to kernel to driver and back again. There's been a lot of research put into optimizing this, but the simplest solution is to just not do this, and to put things like drivers & filesystems in the kernel itself.

3. The existence of Linux as a decentralized open-source operating system took away one of the main organizational reasons for microkernels. When the kernel is proprietary, then all the negotiations, documentation, & interactions needed to get code into an external codebase become quite a hassle, with everyone trying to protect their trade secrets. When it's open-source, you go look for a driver that does something similar, write yours, and then mail a patch or submit a pull request.
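
For a rough feel of the round-trip cost in point 2, here is a small sketch that times the same operation called directly and called through a socket pair. Everything in it is an assumption for illustration: `driver_read`, `serve`, `time_direct` and `time_ipc` are made-up names, the "driver" is a thread in the same process rather than a separate address space, and a socket round trip is only a stand-in for real microkernel IPC, which is far more optimized.

```python
import socket
import threading
import time


def driver_read(block):
    """Hypothetical driver operation: return a fake 4-byte block."""
    return block.to_bytes(4, "big")


def recv_exact(conn, size):
    # Stream sockets may return short reads, so loop until we have `size`
    # bytes or the peer has closed the connection.
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return data


def serve(conn):
    # "User-space driver": answer each 4-byte request with a 4-byte reply.
    while True:
        request = recv_exact(conn, 4)
        if len(request) < 4:
            break
        conn.sendall(driver_read(int.from_bytes(request, "big")))
    conn.close()


def time_direct(n):
    # Monolithic-style path: a plain function call per operation.
    start = time.perf_counter()
    for block in range(n):
        driver_read(block)
    return time.perf_counter() - start


def time_ipc(n):
    # Message-passing path: every operation crosses the kernel twice.
    app_end, driver_end = socket.socketpair()
    worker = threading.Thread(target=serve, args=(driver_end,))
    worker.start()
    start = time.perf_counter()
    for block in range(n):
        app_end.sendall(block.to_bytes(4, "big"))
        reply = recv_exact(app_end, 4)
        assert reply == driver_read(block)
    elapsed = time.perf_counter() - start
    app_end.close()
    worker.join()
    return elapsed


if __name__ == "__main__":
    n = 10_000
    print(f"direct: {time_direct(n):.4f}s  via socket: {time_ipc(n):.4f}s")
```

The absolute numbers depend heavily on the machine and say little about any particular kernel. The sketch only shows the shape of the trade-off: each isolated call becomes a send, a context switch, and a receive in each direction.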