I think micro VMs solved a lot of the issues that people had with regular containers and that unikernels were going to fix. There is still probably a performance improvement to be had with unikernels, but not enough to throw away all the investment companies made into containers.
That said, there are options for running unikernels as K8s workloads if you want, eg NanoVMs: https://docs.ops.city/ops/k8s
https://github.com/google/gvisor gives you essentially the same benefits as a unikernel without having to compromise on compatibility or recompile your apps, and integrates nicely with Kubernetes already. It also doesn't require a hypervisor at all.
I always thought of gVisor as being in the opposite direction of the main point of unikernels in that it's another layer between the app and the kernel rather than removing the separation completely.
Unikernels never really removed the layer between the app and the kernel, they just made the hypervisor the kernel and invented a layer to handle IO to/from the virtual devices presented by the hypervisor, inside the same memory space as the app.
If the hypervisor is KVM, which it is on modern AWS EC2 instances and on GCP, unikernel apps are literally just Linux processes; the underlying Linux host is doing all the heavy lifting. Conceptually, they're essentially the same as a sandboxed ordinary Linux process with an in-process IO stack, but without the ability to monitor or debug them as if they were an ordinary Linux process.
Prometheus and other open source monitoring solutions work out of the box and we even have a custom APM service that is unikernel-tuned https://nanovms.com/radar .
Even though Hyper-V is also a type-1 hypervisor in terms of CPU execution, something still needs to mediate the virtual devices to the physical hardware, and that's done by the hypervisor's kernel. In Hyper-V's case that is NT, which mediates the vNIC with the virtual switch and physical NIC "uplink".
For some devices they can also support hardware-assisted virtualization, such as SR-IOV for PCIe devices (NICs/NVMe storage/GPUs), but it's been pretty rare to see that in practice with unikernels, as they typically have limited physical device driver support. On top of that, it's not really an option everywhere all the time, as it places limitations on the cloud provider that paravirtual devices don't.
Can you explain this? Afaict Hyper-V is the same as VMware or VirtualBox, where you have a host OS and multiple guest OSes (which makes sense because you still need something to run the OS drivers). It sounds like what you’re implying is that it behaves differently, but I’m not sure how. Can you elaborate?
Windows runs as guest OS on top of Hyper-V as well, it is a type 1 hypervisor.
Basically when you activate Hyper-V, you will be getting one VM running where the host is only a guest with special privileges known as root partition.
Though ideally something like SR-IOV would come into play and the hypervisor is just scheduling shared compute. Of course there is theory and there is reality, and the reality is that unikernels never really caught on, for many reasons, while the normal stack just got optimized enough.
Since this looks like an intermediary layer between userspace and the host kernel (at least if I'm reading it correctly), does anyone know what its performance impact is?
IO bound tasks can be up to 10x slower using ptrace. I think using hardware acceleration gives you acceptable performance but ptrace is just a non-starter for prod.
It looks like around 2015 there was a lot of hype around the topic, but as I tried to document the state of the art, I noticed there hasn't been that much pick up yet. What's your take?
The two main advantages of unikernels are performance (reduced CPU and/or memory requirements) and security (hypervisor rather than container boundary).
It turns out that basically nobody cares about either of those. I know someone who for a while worked at a company that was trying to make money off unikernels. They ended up re-implementing a database in a unikernel, in a way that gave them a 2.5x performance improvement over the original project -- or to put it a different way, switching should allow companies to cut their hosting costs in half. Even with such a clear "win", it was still a difficult sell.
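To put a number on that, here is a rough sketch of the arithmetic. It assumes the 2.5x figure is a throughput gain, that load stays constant, and that hosting cost scales linearly with the number of machines; none of that is stated above, and `cost_fraction` is just a name made up for illustration.

```python
def cost_fraction(speedup: float) -> float:
    """Fraction of the original hosting cost still needed after a speedup.

    Assumption: cost is proportional to machine count and load is constant,
    so N machines become N / speedup machines.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


remaining = cost_fraction(2.5)
# prints "remaining cost: 40%, savings: 60%"
print(f"remaining cost: {remaining:.0%}, savings: {1 - remaining:.0%}")
```

Under those assumptions "cut in half" is, if anything, on the conservative side.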
Think about how much backend code is written in interpreted languages like Python, or PHP, or Javascript, rather than in compiled languages like Go or Rust. It's just simpler to start with the simple solution and then throw money at the problem as you scale. And while performance may be one of the reasons that people are choosing Go or Rust for backends, if it were the only advantage, it's unlikely that would be compelling enough.
I suspect it was because containers sufficed and using and creating the tooling around them consumed the attention of those who might have otherwise looked at unikernels.
>I suspect it was because containers sufficed and using and creating the tooling around them consumed the attention of those who might have otherwise looked at unikernels.
I think you might be right. Right place at the right time for containers.
There seems to be a lot of guessing by the author. VMs that run on the CPU have the same performance as everyone else, except when you trap into the hypervisor, and especially on buggy CPUs. Now, there are a lot of buggy CPUs out there right now, so I won't say any more. But just imagine a world where we have a fully working io_uring (all system calls that make sense), and less buggy CPUs.
There are benefits to this if you are deploying something like a 5G microservice or some other single-purpose, small binary that takes few resources but at the end of the day if you are deploying a JVM application that is gigabytes in size it doesn't really matter how small the base image is. Same thing applies in unikernel-land.
The 1 gig limitation on clouds is not such a huge deal though, since we can upload your image at whatever size it is, however small, and then tell the cloud to provision a disk of the size you need, which acts kind of like a sparse file (but technically is not one).
I want a unikernel that runs as a process with no special privileges. Huge bonus points if it's portable to many common operating systems.
Since recompilation is necessary anyway for unikernels, syscalls could be replaced by function calls or some other user mode thing instead of trapped. It would allow entire containers to run as processes. Not that interesting for cloud, but very interesting for distribution to endpoints or self-hostable apps.
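A toy illustration of that idea, in Python only because it is short: the "kernel" is an ordinary object living in the same process, and what would have been a trapping syscall is a plain method call. `LibOS`, its methods, and the in-memory "filesystem" are all hypothetical and do not correspond to any real unikernel's API.

```python
class LibOS:
    """Hypothetical in-process 'kernel': services are ordinary method calls."""

    def __init__(self):
        self._files = {}   # path -> bytearray (toy in-memory filesystem)
        self._fds = {}     # fd -> path
        self._next_fd = 3  # 0-2 are conventionally stdin/stdout/stderr

    def open(self, path):
        """Create the file if needed and hand back a new descriptor."""
        self._files.setdefault(path, bytearray())
        fd = self._next_fd
        self._next_fd += 1
        self._fds[fd] = path
        return fd

    def write(self, fd, data):
        """Append data; no mode switch, just a function call."""
        self._files[self._fds[fd]].extend(data)
        return len(data)

    def read_all(self, fd):
        """Return the whole file contents behind the descriptor."""
        return bytes(self._files[self._fds[fd]])


def app_main(os_):
    """The 'application', linked directly against the library OS."""
    fd = os_.open("/log")
    os_.write(fd, b"hello ")
    os_.write(fd, b"world")
    return os_.read_all(fd)


print(app_main(LibOS()))  # prints b'hello world'
```

The point is only the shape: app and OS services share one address space, so the whole thing can ship as a single unprivileged process.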
Counterpoint: the absence of logging, monitoring, host-based IDS, and all the system engineering tools you have on Linux is a big negative for security.
You can compile in these metric collection facilities into the unikernel if you need them. The whole point of unikernel is to allow you to mix and match only the things you need.
You can log and send metrics from your app over a network (which you should probably be doing instead of writing on disk), monitor a VM, IDS makes no sense when you don't have logins and such. And the tools are rarely installed on "cattle" anyway.
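For what "send metrics from your app over a network" can look like, here is a minimal sketch. It assumes a StatsD-style plaintext line (`name:value|type`) sent over UDP; the metric names and the function names are made up for illustration, and the collector address is whatever you pass in.

```python
import socket


def format_metric(name, value, kind="g"):
    """Encode one metric as a StatsD-style line, e.g. b'app.heap_mb:42|g'.

    kind is 'g' for a gauge or 'c' for a counter (assumed wire format).
    """
    return f"{name}:{value}|{kind}".encode("ascii")


def send_metric(addr, name, value, kind="g"):
    """Fire-and-forget UDP send to addr=(host, port); returns bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(format_metric(name, value, kind), addr)
```

A call such as `send_metric(("127.0.0.1", 8125), "app.requests", 1, "c")` would emit one counter increment; nothing here needs a local disk or an agent inside the guest.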
I'm not sure anyone would advocate the application to monitor itself. Many companies have entire teams of people that have to deal with keeping machines up and they get paid big bucks to do so.
As for the IDS question/statement - can you explain in more detail? Are you talking about file integrity checks or? Unikernels don't have the concept of users or shells or remote login or many of the things that an IDS would actually be looking at.
If it was something such as an attacker overwriting a shared library and you want to monitor or ensure that can't happen both of those operations are feasible in unikernels.
File integrity checks are from the 1990s. There are various domain-specific HIDS, most of them closed source, that observe the runtime behavior of applications.
Also a lot of hardware and VM management software that performs remote administration functions, e.g. asset tracking, reacting to low batteries on UPSes, monitoring network health...
It's absurd to think that a whole OS worth of code should be jammed into the application or the unikernel. That's what traditional kernels are for.
The lightweight-ness argument is enticing, but I'm wondering if the fact that now you have to give these VMs enough RAM to run the app won't mean that you end up with worse flatpacking in terms of RAM than if you used containers?
It depends. A t2.nano has 512mb of memory. If you are using Go or something like that you could go much lower but any runtime environment such as ruby/python/node are going to want a minimum of a few hundred meg. If you are using the JVM you most definitely are going to want an instance with much more memory.
At the end of the day your application decides how much memory it wants and the sysadmin/SRE/devops person just ensures it has enough so it doesn't crash.
If you are hosting your own workloads and those workloads only need tens of megs of RAM, then you can pack as many as your hypervisor can handle.
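As a back-of-the-envelope sketch of that packing math: the 256 GiB host and the per-guest sizes below are hypothetical, and the model ignores hypervisor overhead beyond a flat reserve, page sharing, and ballooning.

```python
MIB = 1024 * 1024


def max_guests(host_ram_bytes, per_guest_bytes, reserved_bytes=0):
    """How many fixed-size guests fit in host RAM after a flat reserve."""
    if per_guest_bytes <= 0:
        raise ValueError("per_guest_bytes must be positive")
    usable = max(host_ram_bytes - reserved_bytes, 0)
    return usable // per_guest_bytes


host = 256 * 1024 * MIB  # hypothetical 256 GiB host

print(max_guests(host, 32 * MIB))   # prints 8192 (32 MiB guests)
print(max_guests(host, 512 * MIB))  # prints 512 (512 MiB guests)
```

So the density is driven almost entirely by what the application itself needs, not by the guest kernel.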
Alfred talks about booting 110,000 vms on one host before memory exhaustion:
kinda. if the guest uses something like virtio balloon the sharing isn't any worse. handling exhaustion isn't any better - although I think transparent migration of vms is more of a standard function than for containers, so that's some upside I guess.
>Why would you spin up a full kernel to run a single application
You're misunderstanding the levels of abstraction probably because the word "kernel" within "unikernel" is throwing you off. The idea is to use a partial kernel (only the minimum services one needs). The so-called "kernel" is a library of code where you compile the minimum bits into the single-process image.
>Operating systems already solve this problem relatively well, without the overhead, via processes
A full operating system like Linux is expending extra overhead to schedule/prioritize/monitor processes (plural) -- because Linux is designed to be more general purpose and open-ended than a specialized unikernel. In contrast, a unikernel with only 1 singular process (say a specialized db engine) doesn't need to expend extra cpu on process scheduling.
All that said, it doesn't seem like unikernels have enough advantages to attract widespread adoption like containers.
Yes. Think of it like depending on a small kernel directly in your build step. So your application gets compiled with everything (including OS interface) that it needs and nothing more. The result is a bootable image that is only capable of running your app.
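One way to picture "everything it needs and nothing more" is as a dependency closure computed at build time. The sketch below is purely illustrative: the component names and the graph are invented, and real unikernel toolchains do this through their own build systems and linkers rather than anything like this function.

```python
# Hypothetical component graph: each kernel library lists what it depends on.
COMPONENTS = {
    "tls": {"tcp", "entropy"},
    "tcp": {"ip"},
    "ip": {"netdev"},
    "netdev": set(),
    "entropy": set(),
    "fs": {"blockdev"},
    "blockdev": set(),
    "usb": set(),
}


def link_set(required):
    """Return the transitive closure of components the app actually needs."""
    selected = set()
    stack = list(required)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        selected.add(name)
        stack.extend(COMPONENTS[name])  # pull in this component's dependencies
    return selected


# An app that only speaks TLS never links the filesystem or USB code.
print(sorted(link_set({"tls"})))  # prints ['entropy', 'ip', 'netdev', 'tcp', 'tls']
```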
I think the value isn't in the containerization vs unikernel comparison. If you're using containerization you've accepted certain security risks. Where unikernels have a lot of potential IMO is in high security environments where the security risks of containerization are not acceptable.
This is a very common misconception. Just yesterday I was helping someone out with a networking issue they were having on AWS stemming from this concept.
Coming from k8s/firecracker it is common to think that you need to orchestrate your unikernels with a framework of some kind. In our case (Nanos/OPS) a lot of people think that means spinning up an ec2 linux, sshing in and using 'ops run' on top of that but that is never suggested for prod deploys. Instead we suggest doing an 'image create' followed by an 'instance create'.
What does this mean? Essentially every time you hit the deploy button a new ami is made and a brand new ec2 instance spins up without any linux inside. So instead of adding layers through containers we actually subtract them. That means you can still configure the instance to your heart's content but you don't have to manage it - the cloud does that for you, and this is a huge win for many teams that don't want to deal with all the ops/SRE work that something like k8s brings (or even normal vanilla linux does).
It is important to realize that containers exact heavy performance penalties when running on top of existing infrastructure (like the cloud) since they duplicate storage and networking layers. They also have severe security issues - the shared kernel being the main one.
What is the functional difference of os=>hypervisor=>unikernel vm vs. os=>capabilities and pledges or containers? I would get if we use a unikernel approach running on bare metal for high security, specialised applications but this doesn't seem to exist?
The difference is that the vast majority of people are deploying to the cloud so they are already deploying to a hypervisor. Every single cloud is built on top of virtualization. AWS used to use Xen, now they use KVM. Google Cloud is entirely built on KVM. Azure uses Hyper-V. The cloud is just an API for virtualization.
Instead of AWS (hypervisor) => linux => k8s => containers, unikernels advocate for AWS (hypervisor) => unikernel, and that makes them run much faster in general (we've clocked upwards of 300% req/sec for go/rust webservers on AWS for instance) and a lot safer.
Nanos supports multiple threads but not multiple processes, so you can have as much performance as you have underlying hardware, but if you are using something like an interpreted language where it is normal to spin up X app-workers behind a reverse proxy, those become vms. (I should point out that those languages are single-thread/single-process to begin with.)
Microkernels are a much more viable way of solving security problems in an operating system. Windows and Linux could both be rewritten as microkernels within a couple of months or years.
Windows and Darwin (MacOS) were originally designed to be hybrid kernels, but compromised by allowing more and more stuff into kernel space until they were the monolithic kernels we know today. Changing code built up over 20-30 years while maintaining compatibility, security, and performance guarantees is not something which could be accomplished in a couple of months.
Early versions of Windows NT WERE a 100% honest microkernel OS. Microsoft abandoned that approach when they realized they had zero chance of being competitive with Unix with a microkernel architecture.
Darwin was never intended to be anything except what it is, which is a monolithic kernel. The XNU kernel was based on the FreeBSD and Mach kernels. Some versions of the Mach kernel were microkernels, but many were not.
Both NT and XNU incorporate message passing features from microkernels, but they are monolithic in that they are essentially a single large process.
"Hybrid kernel" is more of a marketing thing than an engineering term.
Microkernels are a dead-end and never stopped being a dead end. It's a lovely idea that didn't work out. They had limited commercial success in embedded systems, but only because those embedded systems didn't actually do very much and what they did was largely not performance critical.
> Microkernels are a dead-end and never stopped being a dead end. It's a lovely idea that didn't work out. They had limited commercial success in embedded systems, but only because those embedded systems didn't actually do very much and what they did was largely not performance critical.
VMware's vmkernel? VFIO/Kernel-bypass? Shoving things into kernel space does not guarantee good performance by any means, and it murders security.
The Hurd is actually older than Linux now at a ripe age of 32, although I think it was Mendel Rosenblum that said "hypervisors/machine monitors were microkernels done right".
Modern hypervisors are more or less modeled on microkernels, not the other way around. Microkernels are just naturally a good fit for hypervisors, hence why nearly all commercial microkernels have an embedded hypervisor solution.