by artlogic 3886 days ago
I think a lot of the common idioms passed around about microkernels stem from the old Torvalds v. Tanenbaum argument [1]. It wasn't that Torvalds was right and Tanenbaum was wrong. It was simply that Tanenbaum had reliability and theoretical correctness in mind, while Torvalds was concerned with performance. The less context switching you do, the more performant your computer will be. We can take this to the extreme and look at OSes like TempleOS [2] where EVERYTHING runs in kernel mode. Most people prefer a bit more reliability, and so modern desktop kernels are generally hybrid kernels (with the notable exception of Linux which is, of course, monolithic). I don't think there's any real technical reason a microkernel couldn't work on today's hardware, which is many orders of magnitude faster than what we had in 1992, but a hybrid kernel is always going to be faster.

[1] https://en.wikipedia.org/wiki/Tanenbaum%E2%80%93Torvalds_deb...

[2] http://www.templeos.org/


I always found it interesting how much Intel's protection ring architecture dictated the direction of operating systems back in the 1980s. (I'm sure Intel didn't even invent it; similar concepts were probably already in place in mainframes?)

x86 has highly specific support for protection rings and switching between them, as well as things like page faulting and interrupt management, leading to the classic kernel/user split with a kernel as a privileged actor underneath a user mode.

But having just one exclusive, reserved "kernel mode" is starting to look old, which is why there's now so much talk about virtualization and exokernels and so on. The microkernel design certainly seems very elegant, but it looks to me like Intel's architecture was always a stumbling block. You have to wonder about what hardware support you could invent that would make microkernels a better fit.

Protection rings are a lot older than x86. They were first used on the Honeywell 6180 for Multics back in the 60s, and have been used in pretty much every sufficiently large processor since. The VAX had a two-ring layout, as did the 68k, so the seeds for OS development in the 80s were already pretty soundly planted.

Page faulting also existed on older systems, because putting those things in hardware is a lot faster than doing them in software, plus controlling memory access really should be a privileged activity. Interrupt handling is in a similar situation, and even there, you still need some process handling the interrupt vector table. It's possible to make most of an interrupt handler a user-level process through page table and interrupt return address hacking, but for the moment, it's unfortunately rare.

As for your desire for a replacement for one exclusive, reserved kernel mode, there have been a few OSes that have tried to break that pattern. OS/2 used Ring 2 of the x86 for drivers, but unfortunately that bit wasn't added to Windows when they were forking NT. Being able to put semi-trusted drivers in a separate area, and perhaps even a user session manager too, could allow for some interesting security experiments that don't rely on (para)virtualization.

Hardware-wise, it would be useful to have hardware contexts, like SPARCs have, so that the group of registers a process has can be swapped in and out a lot more easily. Context switching is expensive, and building processors that realize that the modern user tends to have more than one task running would be a pretty good performance win.

The VAX had 4 modes: kernel, executive, supervisor, user. I just read [0] that kernel and executive had implicit SETPRV privilege (and thus access to those modes would be tightly controlled because SETPRV allows bypassing all access controls); and I think I heard a while back that normal programs ran in user mode but the shell-equivalent ran in supervisor mode (and [1] sort of implies this).

[0] http://h30266.www3.hp.com/odl/vax/opsys/vmsos73/vmsos73/5841...

[1] "If you created the name with a DCL command, the access mode defaults to supervisor mode. If you created the name with a program, the access mode typically defaults to user mode." http://h71000.www7.hp.com/doc/731final/4477/4477pro_007.html

You're right. The biggest news for operating systems is the fact that the hardware virtualization features on modern processors allow breaking free from the old model.

A modern Intel/AMD processor lets you virtualize the CPU, MMU, and I/O. Network cards and HBAs can be partitioned into virtual NICs.

Two primary functions of a classic OS are process isolation (i.e. virtual memory) and hardware sharing. But now that these two primary functions have been pushed down into hardware, it has changed the way we think about operating systems considerably.

There isn't much left for a classic OS to do other than provide a common set of APIs for programs to talk through. Yet these days we can statically link even the largest libraries into our exokernels. And hypervisors are capable of using techniques like same-page merging to reduce the memory burden of running many large (exo)kernels at once.
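To make the same-page merging point concrete, here is a toy model in Python. It is only a sketch of the idea (identical pages across guests are stored once), not how any real hypervisor implements it. The `PAGE_SIZE` value, the `merge_identical_pages` function, and the guest names are all assumptions made up for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes, for illustration only


def merge_identical_pages(guests):
    """Toy same-page merging.

    guests maps a guest name to a list of page contents (bytes).
    Returns (store, mappings): store holds one copy of each distinct
    page keyed by its SHA-256 digest, and mappings records which
    stored pages each guest refers to.
    """
    store = {}
    mappings = {}
    for name, pages in guests.items():
        refs = []
        for page in pages:
            key = hashlib.sha256(page).digest()
            # Keep only the first copy of any given page content.
            store.setdefault(key, page)
            refs.append(key)
        mappings[name] = refs
    return store, mappings


# Two hypothetical guests that statically link the same library pages.
lib_a = b"A" * PAGE_SIZE
lib_b = b"B" * PAGE_SIZE
guests = {
    "guest1": [lib_a, lib_b, b"1" * PAGE_SIZE],
    "guest2": [lib_a, lib_b, b"2" * PAGE_SIZE],
}
store, mappings = merge_identical_pages(guests)
total_pages = sum(len(p) for p in guests.values())
print(f"{total_pages} guest pages backed by {len(store)} stored pages")
```

In this made-up case the two shared library pages are stored once, so six guest pages need only four backing pages. A real implementation also has to handle writes to a shared page (copy-on-write), which this sketch ignores.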

I don't expect the classic one-OS-running-many-processes model to go away overnight, or possibly ever. But the exokernel model is very compelling for large-scale high performance software services and it will continue to catch on.

Take a look at the Mill's security presentation.

It's an architecture where there is no real border between a monolithic kernel and a microkernel. It's just differences in access control policies, to the point that those names lose their meaning.

As always, it's a very interesting architecture. I hope they produce it someday.

GEMSOS and STOP OS built highly-secure systems (for the time) on Intel by using the protection rings and segments. Both only put the security kernel in kernel mode, user apps in user mode, and OS services in middle rings.

Here's the architecture for GEMSOS. See the design/assurance sections.

http://aesec.com/eval/NCSC-FER-94-008.pdf

Look up SCOMP Final Evaluation Report if you want to see how STOP OS used four rings and had an IOMMU despite that being "invented" recently. ;) The XTS-400 is the Intel version, uses same architecture minus custom hardware, and is still doing its job at hundreds of installations.

Definitely a major performance hit on both GEMSOS and STOP, but they were 80's era stuff. Modern separation kernels do most stuff with just user/kernel mode separation with tiny kernels (4-12kloc). LynxSecure claims this helps them keep the CPU 97% idle with 100,000+ context switches a second. I'd expect old architectures to run even faster with modern techniques.
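As a rough sanity check on the LynxSecure figures quoted above, the arithmetic can be sketched in Python. This assumes the 97% idle figure and the 100,000 switches a second refer to the same single CPU over the same interval, and that all non-idle time is charged to context switching, so it is only an upper bound. The function name is made up for illustration.

```python
def max_switch_cost_ns(idle_fraction, switches_per_second):
    """Upper bound on the cost of one context switch, in nanoseconds.

    Assumes every non-idle nanosecond is spent switching, which
    overstates the real cost because some busy time is useful work.
    """
    busy_ns_per_second = (1.0 - idle_fraction) * 1e9
    return busy_ns_per_second / switches_per_second


bound = max_switch_cost_ns(0.97, 100_000)
print(f"at most about {bound:.0f} ns per context switch")
```

Under those assumptions, 3% of a second is 30 ms of busy time, and spreading that over 100,000 switches gives a ceiling of roughly 300 ns per switch.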

> Torvalds was concerned with performance

Performance is not a single metric. There is throughput, latency and then you can screw it all up and make it much harder by demanding guarantees on either of those.

Performance without guarantees is worth very little in quite a few situations.
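A small Python sketch of why throughput alone hides the problem. The two workloads below are invented numbers, not measurements: both complete ten operations in 100 ms, so their throughput is identical, but their worst-case latency is very different.

```python
def summarize(latencies_ms):
    """Return (throughput in ops/s, worst-case latency in ms).

    latencies_ms lists how long each operation took when the
    operations run back to back on one worker.
    """
    total_ms = sum(latencies_ms)
    throughput = len(latencies_ms) * 1000 / total_ms
    return throughput, max(latencies_ms)


steady = [10] * 10        # every operation takes 10 ms
bursty = [1] * 9 + [91]   # nine fast operations, one very slow one

for name, workload in (("steady", steady), ("bursty", bursty)):
    throughput, worst = summarize(workload)
    print(f"{name}: {throughput:.0f} ops/s, worst case {worst} ms")
```

Both workloads come out at 100 ops/s, yet only the steady one could promise a 10 ms response bound. That gap is what a guarantee on latency is about.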

Good point. The other argument is also true: security and correctness are not a single metric either.

With modern (buggy) hardware and DMA access, when your driver and/or hardware fails all bets are off. Some hardware may be possible to reboot (much as you'd reinitialize a kernel module in Linux), but sometimes your best course of action is a complete reboot.

As for security, you also need to take a long hard look at the operating systems your operating system relies on, such as the ones powering your disks, NIC, PCI controller etc. There are some potentially tricky security interactions with them.

SMM and other such ring "-1" type "services" in modern CPUs make your point quite clear to anyone who digs deep enough.

When trying to secure a system, we have reached the point where you sometimes have to ask "is this CPU opcode safe?" Sometimes it just feels like modern hardware complexity is reaching some kind of critical mass threshold for "stupid shit".

That point was reached back in the '90s, when the first security evaluations of the Intel architecture were done. They found tons of black boxes like SMM and said to ditch it for security or virtualization. Invisible Things did a good job demonstrating an old risk, but people should've ditched it long ago.

If you want verifiable hardware, look up the VAMP processor as it has everything from design descriptions to formal proofs of correctness. Not sure about its availability. SPARC and RISC-V are very open with open-source implementations available with Linux and compiler support. So, there's a solution if people ever want to put the work in.