| > The world moved away from such designs because the observability and stability were awful. We might have "safe" languages now that improve stability of the application, but those could just run as a process in an environment where proper debugging can be done when something goes wrong. The few percentage points of performance that you get from eliminating the mechanisms that enable you to understand what went wrong when things go wrong do not justify discarding them.

I think there is a lot of design space here that is unexplored, so I'm not so sure it is as clear-cut as you say. You might like this talk given earlier this year at Compose Conference, entitled "Composing Network Operating Systems" (I was a speaker at Compose and I <3'd this talk a lot): https://www.youtube.com/watch?v=uXt4a_46qZ0

It is not just about performance in all cases. Mirage is the particular case in question here, but with OCaml functors it becomes possible to compose the components of a kernel in truly modular ways. I was continuously surprised by this talk. Something that needs to write to a block device only needs to be a functor over an abstract signature describing the interface to the device, with some primitives to read from or write to it. There are many implementations of this interface. This seems quite obvious, but it allows powerful ideas, and the talk shows examples along these lines.

But what if you want to test your kernel? You can simply substitute in a new implementation that has failure modes. You can write a block device that ignores every 100th write, one that has unexpectedly high latencies, one that outright hangs on all I/O requests... Doing this kind of fault injection today is possible, but it's conceptually a lot nicer if it's just a "Mock" at the "block device" level that you can easily control and extend. You can do all kinds of other things, like have your system timer freak out, skew in random ways, or run in reverse.
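To make the block-device idea concrete, here is a minimal sketch in plain OCaml. It does not use Mirage's actual signatures: `BLOCK`, `Ram`, `Drop_every` and `Store` are hypothetical names invented for illustration, and the real Mirage interfaces are asynchronous and richer than this.

```ocaml
(* Hypothetical block-device interface; NOT Mirage's real signature. *)
module type BLOCK = sig
  type t
  val create : sectors:int -> t
  val read : t -> int -> (bytes, string) result
  val write : t -> int -> bytes -> (unit, string) result
end

(* A plain in-memory device that behaves correctly. *)
module Ram : BLOCK = struct
  type t = bytes option array
  let create ~sectors = Array.make sectors None
  let read t i =
    if i < 0 || i >= Array.length t then Error "out of range"
    else Ok (match t.(i) with Some b -> Bytes.copy b | None -> Bytes.empty)
  let write t i b =
    if i < 0 || i >= Array.length t then Error "out of range"
    else (t.(i) <- Some (Bytes.copy b); Ok ())
end

(* Fault-injecting wrapper: reports success but drops every [N.n]th write. *)
module Drop_every (N : sig val n : int end) (B : BLOCK) : BLOCK = struct
  type t = { dev : B.t; mutable writes : int }
  let create ~sectors = { dev = B.create ~sectors; writes = 0 }
  let read t i = B.read t.dev i
  let write t i b =
    t.writes <- t.writes + 1;
    if t.writes mod N.n = 0 then Ok ()  (* pretend success, lose the data *)
    else B.write t.dev i b
end

(* The code under test depends only on the BLOCK signature. *)
module Store (B : BLOCK) = struct
  let save_all dev items =
    List.iteri (fun i s -> ignore (B.write dev i (Bytes.of_string s))) items
  let load dev i =
    match B.read dev i with
    | Ok b -> Bytes.to_string b
    | Error e -> "error: " ^ e
end

(* Same Store code, two different devices. *)
module Flaky = Drop_every (struct let n = 3 end) (Ram)
module Good_store = Store (Ram)
module Flaky_store = Store (Flaky)
```

`Store` never learns which device it was given, so swapping `Ram` for `Flaky` is a one-line change at the point where the functor is applied. A slow or hanging device could be written as another wrapper in the same style.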
You mention observability, but when your systems are truly modular, this is nothing more than an obvious follow-up. An example in the talk is interposing "Irmin", which is a distributed, Git-esque storage system, into the network subsystem of your kernel driver. Any time interface properties of the device change, you write entries into the append-only Irmin log, which is distributed. Irmin also has a Git interface for read-only analysis. The short version is that the talk includes a live example where you can query a Git repository to get a read-only changelog of all the networking state in your application. In the particular example, I believe it was interposed into the ARP implementation; every ARP packet and ARP response was logged into Irmin, and every system change propagated as a result was logged too.

This gives you really amazing levels of persistent analysis and introspection with very low developer cost. It's true you could do something similar in a system today, but this is truly modular, works for any application built to use a particular functorised API, etc. It's a programming interface! And in theory there's also nothing stopping conventional tools like `ocamldebug` from working either.

Mirage also abstracts over the true underlying runtime. So that same device API can be switched with one that just talks to a POSIX-compliant filesystem, you get an ELF executable, etc. This all works on normal systems too; Unikernels are merely a different deployment target (for the most part).

This is not to say that Unikernels are the future or that we should abandon the stable systems we have now (I definitely won't be doing so anytime in the future). But I found myself very surprised at what was quite easily possible, and I wouldn't so quickly write it all off as a fad. Maybe for Huge Enterprise, yeah... Operations experience separate from development is very useful, and a lot easier to find.
But there are definitely some really cool uses for these things, especially in helping rethink and improve on some previous ideas.

> Also, you lose the advantage of a shared memory pool with unikernels, which run in VMs. Partitioning memory in VMs causes internal fragmentation, which lowers densities. It also can cause double caching between the host and guest, which lowers block IO efficiency.

This is a good point that's often overlooked. But I don't look to Unikernels for outright performance, either; to me, they are more interesting for researching newer operating system designs with a much better ROI than previous methods. I'm glad to see that happening, personally. And I might even take a performance loss if it meant winning some other guarantees in return. |
I guess my point is that the unikernel is always going to be the equivalent of a userland process. The question is whether your bare-metal kernel is going to be a traditional one or a hypervisor. Unikernels have definite performance advantages over a traditional kernel when your bare-metal kernel is a hypervisor, but I believe that is the wrong abstraction when I consider overhead.