by ryao 3659 days ago
> Try 5, 10, 20 megabyte small.

OpenWRT/LEDE will happily work on a system with 4MB of storage:

https://www.lede-project.org

QNX had a graphical environment, a web browser, a web server, a text editor, image viewer, various games, a package manager, etcetera on a 1.44MB floppy:

http://m.youtube.com/watch?v=K_VlI6IBEJ0

Less is definitely more, but you do not need a unikernel to achieve such sizes, and you lose observability by going with a unikernel. If something goes wrong with your application, such as it becoming non-responsive, you need to attach gdb or get a core dump like a kernel developer would to understand what happened. Your production systems are likely EC2 instances that lack such functionality, which means debugging is much harder with a unikernel than it would have been with a monolithic, hybrid or microkernel. Furthermore, disk space is cheap, which is why few opt for OpenWRT/LEDE over more full-featured Linux distributions in datacenters.

If you want the experience of a single address space and little more code than your application, you could run FreeDOS, which also fits on a floppy and has a code base that is mature. There are guides for doing this online. Here is one for doing a web server:

http://www.instructables.com/id/Retro-dos-web-server/?ALLSTE...

The world moved away from such designs because the observability and stability were awful. We might have "safe" languages now that improve stability of the application, but those could just run as a process in an environment where proper debugging can be done when something goes wrong. The few percentage points of performance that you get from eliminating the mechanisms that enable you to understand what went wrong do not justify discarding them.

Also, you lose the advantage of a shared memory pool with unikernels, which are generally intended to run in VMs. Partitioning memory in VMs causes internal fragmentation, which artificially lowers the density of applications per machine. It can also lower block IO efficiency through double caching between the host and guest. Hardware virtualization is a useful technology, but it is an inefficiency that we need to eliminate with containers, rather than one that we should embrace with unikernels.
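
To put rough numbers on the fragmentation point, here is a back-of-the-envelope sketch in Python. Every figure in it (a 64 GiB host, VMs sized at 512 MiB, applications that typically use 300 MiB) is a made-up assumption for illustration, not a measurement, and it ignores guest kernel and hypervisor overhead entirely.

```python
def instances_that_fit(host_mib: int, per_instance_mib: int) -> int:
    """How many instances fit if each one reserves per_instance_mib."""
    return host_mib // per_instance_mib


def stranded_mib(instances: int, reserved_mib: int, used_mib: int) -> int:
    """Memory reserved by the instances but not actually used by them."""
    return instances * (reserved_mib - used_mib)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
HOST_MIB = 64 * 1024      # 64 GiB host
VM_RESERVED_MIB = 512     # fixed size given to every VM
APP_TYPICAL_MIB = 300     # what the application usually touches

# Fixed partitions: every VM holds its full reservation, used or not.
vms = instances_that_fit(HOST_MIB, VM_RESERVED_MIB)

# Shared pool: processes only consume what they touch (typical case).
processes = instances_that_fit(HOST_MIB, APP_TYPICAL_MIB)

wasted = stranded_mib(vms, VM_RESERVED_MIB, APP_TYPICAL_MIB)
print(vms, processes, wasted)
```

With these assumed figures the partitioned host holds 128 instances and the shared pool 218, with 27136 MiB sitting reserved but idle inside the VMs. Real systems blur this with ballooning and overcommit, so treat it as the shape of the argument rather than a prediction.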

4 comments

> The world moved away from such designs because the observability and stability were awful. We might have "safe" languages now that improve stability of the application, but those could just run as a process in an environment where proper debugging can be done when something goes wrong. The few percentage points of performance that you get from eliminating the mechanisms that enable you to understand what went wrong do not justify discarding them.

I think there is a lot of design space here that is unexplored, so I'm not so sure it is as clear cut as you say. You might like this talk given earlier this year at Compose Conference, entitled "Composing Network Operating Systems" (I was a speaker at Compose and I <3'd this talk a lot.)

https://www.youtube.com/watch?v=uXt4a_46qZ0

It is not just about performance in all cases. Mirage is the particular case in question here, but with OCaml functors, it becomes possible to compose the components of a kernel in truly modular ways. I was continuously surprised by this talk.

Something that needs to write to a block device only needs an abstract functor describing the interface to the device and some primitives to read or write to it. There are many implementations of this interface.

This seems quite obvious, but it allows powerful ideas. For example, in the talk, you can see examples similar to this. But what if you want to test your kernel? You can simply substitute in a new implementation that has failure modes. You can write a block device that randomly ignores every 100th write, one that has unexpectedly high latencies, one that outright hangs on all I/O requests... Doing this kind of fault injection today is possible, but it's conceptually a lot nicer if it's just a "Mock" at the "block device" level that you can easily control and extend. You can do all kinds of other things, like have your system timer freak out, skew in random ways, or run in reverse.
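
To make the fault-injection idea concrete, here is a minimal sketch. It is written in Python rather than OCaml, so it only approximates what a functor gives you, and none of the names below (`BlockDevice`, `DroppingBlockDevice`, `KeyValueStore`, `count_lost_writes`) come from Mirage; they are invented for illustration. The point is just that the component under test sees only the interface, so a misbehaving implementation can be swapped in without touching it.

```python
from typing import Dict, Protocol


class BlockDevice(Protocol):
    """The abstract interface a component is written against (hypothetical)."""

    def read(self, sector: int) -> bytes: ...
    def write(self, sector: int, data: bytes) -> None: ...


class MemoryBlockDevice:
    """A well-behaved in-memory implementation."""

    def __init__(self, sector_size: int = 512) -> None:
        self.sector_size = sector_size
        self.sectors: Dict[int, bytes] = {}

    def read(self, sector: int) -> bytes:
        # Sectors that were never written read back as zeros.
        return self.sectors.get(sector, bytes(self.sector_size))

    def write(self, sector: int, data: bytes) -> None:
        self.sectors[sector] = data


class DroppingBlockDevice:
    """Wraps any BlockDevice and silently drops every Nth write."""

    def __init__(self, inner: BlockDevice, drop_every: int = 100) -> None:
        self.inner = inner
        self.drop_every = drop_every
        self.writes_seen = 0

    def read(self, sector: int) -> bytes:
        return self.inner.read(sector)

    def write(self, sector: int, data: bytes) -> None:
        self.writes_seen += 1
        if self.writes_seen % self.drop_every == 0:
            return  # injected fault: report success, store nothing
        self.inner.write(sector, data)


class KeyValueStore:
    """A component that only knows about the BlockDevice interface."""

    def __init__(self, device: BlockDevice) -> None:
        self.device = device

    def put(self, key: int, value: bytes) -> None:
        self.device.write(key, value)

    def get(self, key: int) -> bytes:
        return self.device.read(key)


def count_lost_writes(store: KeyValueStore, n: int) -> int:
    """Write n distinct values, then count how many fail to read back."""
    for i in range(n):
        store.put(i, i.to_bytes(4, "big"))
    return sum(1 for i in range(n) if store.get(i) != i.to_bytes(4, "big"))
```

Running `count_lost_writes` against a store built on `MemoryBlockDevice()` and again on `DroppingBlockDevice(MemoryBlockDevice(), 100)` exercises the same `KeyValueStore` code with and without the fault. A slow or hanging device would be another wrapper of the same shape.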

You mention observability, but when your systems are truly modular, this is nothing more than an obvious follow-up. An example in the talk is interposing "Irmin", which is a distributed, Git-esque storage system, into the network subsystem of your kernel driver. Any time interface properties of the device change, you write entries into the append-only Irmin log, which is distributed. Irmin also has a git interface for read-only analysis.

The short story is that, in the talk, there is a live example where you can query a git repository to get a read-only changelog of all the networking state in your application. In the particular example, I believe it was interposed into the ARP implementation; every ARP packet and ARP response was logged into Irmin, and every system change propagated as a result was logged too. This gives you really amazing levels of persistent analysis and introspection with very low developer cost. It's true you could do something similar in a system today; but this is truly modular, works for any application built to use a particular Functorised-API, etc. It's a programming interface! And in theory there's also nothing stopping conventional tools like `ocamldebug` from working either.
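
The interposition pattern itself is small. The sketch below is Python and does not use Irmin or any Mirage API; a plain in-memory list stands in for the append-only log, and every name in it (`ArpCache`, `LoggedArpCache`, `LogEntry`, `history`) is hypothetical. It only shows the shape: wrap the component behind the same interface and record each state change on the way through.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


class ArpCache(Protocol):
    """Hypothetical interface for an IP-to-MAC table."""

    def learn(self, ip: str, mac: str) -> None: ...
    def lookup(self, ip: str) -> Optional[str]: ...


class SimpleArpCache:
    """Plain implementation with no logging."""

    def __init__(self) -> None:
        self.table: Dict[str, str] = {}

    def learn(self, ip: str, mac: str) -> None:
        self.table[ip] = mac

    def lookup(self, ip: str) -> Optional[str]:
        return self.table.get(ip)


@dataclass(frozen=True)
class LogEntry:
    seq: int
    event: str  # "add" or "update"
    ip: str
    mac: str


class LoggedArpCache:
    """Interposes on any ArpCache and appends one entry per state change."""

    def __init__(self, inner: ArpCache, log: List[LogEntry]) -> None:
        self.inner = inner
        self.log = log  # stand-in for an append-only store

    def learn(self, ip: str, mac: str) -> None:
        previous = self.inner.lookup(ip)
        self.inner.learn(ip, mac)
        if previous != mac:  # only record real changes
            event = "add" if previous is None else "update"
            self.log.append(LogEntry(len(self.log), event, ip, mac))

    def lookup(self, ip: str) -> Optional[str]:
        return self.inner.lookup(ip)


def history(log: List[LogEntry], ip: str) -> List[str]:
    """Read-only view of everything that happened to one address."""
    return [f"{e.seq}:{e.event}:{e.mac}" for e in log if e.ip == ip]
```

Code that takes an `ArpCache` cannot tell whether it was handed the plain or the logged one, which is the property the talk relies on. The persistence, distribution and git view are what Irmin adds on top, and none of that is modelled here.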

Mirage also abstracts over the true underlying runtime. So that same device API can be switched with one that just talks to a POSIX-compliant filesystem, you get an ELF executable, etc. This all works on normal systems too; Unikernels are merely a different deployment target (for the most part).

This is not to say that Unikernels are the future or we should abandon our stable systems we have now (I definitely won't be doing so anytime in the future). But I found myself very surprised at what was quite easily possible, and I wouldn't so quickly write it all off as a fad. Maybe for Huge Enterprise, yeah... Operations experience separate from development is very useful, and a lot easier to find. But there's definitely some really cool uses for these things, especially in helping rethink and improve on some previous ideas.

> Also, you lose the advantage of a shared memory pool with unikernels, which run in VMs. Partitioning memory in VMs causes internal fragmentation, which lowers densities. It also can cause double caching between the host and guest, which lowers block IO efficiency.

This is a good point that's often overlooked. But I don't look to Unikernels for outright performance, either; to me, they are more interesting for researching newer operating system designs with a much better ROI than previous methods. I'm glad to see that happening, personally. And I might even take a performance loss if it meant winning some other guarantees in return.

Unikernel proponents seem to assume that hardware virtualization will forever be the abstraction of cloud computing. However, hardware virtualization is the wrong abstraction, which is why the industry is beginning to adopt containers. There is no reason why you cannot run a unikernel in UNIX binary mode inside a container, but then it is really just a different way of developing a userland process rather than a unikernel. You definitely could still call it a unikernel. You would get the advantages of modularity that you specified and you would have all of the debuggability and observability that regular applications have today with the tools that we have today. However, that is rather different than the role in which they are intended to operate.

I guess my point is that the unikernel is always going to be the equivalent of a userland process. The question is whether your bare-metal kernel is going to be a traditional one or a hypervisor. Unikernels have definite performance advantages over a traditional guest kernel when your bare-metal kernel is a hypervisor, but I believe that is the wrong abstraction when I consider overhead.

>> Try 5, 10, 20 megabyte small.

> OpenWRT/LEDE will happily work on a system with 4MB of storage:

> https://www.lede-project.org

> QNX had a graphical environment, a web browser, a web server, a text editor, image viewer, various games, a package manager, etcetera on a 1.44MB floppy

This is all true, but you skipped the sentence following the one you quoted: "Depending on your needs, you can go down into the kilobyte range - and that's not just the app - that's everything".

A microkernel could go into the kilobyte range too. seL4 is definitely in that range. The smallest Linux kernel bzImage that I ever compiled was something like 570KB, so embedded Linux might be able to reach that range too. That would of course include an application. The QNX demo likely could reach such sizes too if most of the things in it were removed. For those who are unaware, QNX is a microkernel-based system.

seL4 doesn't provide very much to the programs, does it? I think it is closer in spirit to Xen than to the Linux kernel?

The Xen hypervisor uses a microkernel architecture. A kernel could implement a userland (e.g. traditional UNIX), a VM interface (e.g. KVM) or nothing at all (e.g. a unikernel). My point is that unikernels do not have a monopoly on small sizes and they are not worth mentioning as an advantage.

No, when your service that's running on 10k machines does not behave as expected, you do not attach a debugger to it or dump a core.

UNIX is a great OS for sharing a machine among many users and many programs. It's not that great when your app is made of thousands of asynchronous programs. Last decade's tools have to be reinvented regardless of microkernels.

An oft-overlooked advantage is the ability to specify a holistic system with fine-grained accuracy.

With a monolithic kernel in the way you have to make some black-box concessions.

Any kind of modularity will result in people treating modules as black boxes, even when they are not. If you use a monokernel/hybrid that is OSS, then there might be plenty of code and you might treat it as a black box, but it is not a black box. If you use a microkernel, then the amount of code is likely even less. Microkernels exist that are formally verified, while unikernels cannot be formally verified without formally verifying the application that is a part of them. Formally verifying an application would be awesome, but few would ever do that.