Hacker News
by michaelmior 4537 days ago
My understanding is that the goal is not isolation, but performance. You can remove large chunks of the OS which you don't need. You also don't have any overhead from system calls since all code runs at the same privilege level. This is possible because (in theory) you can't execute arbitrary code. All the executable code is baked into the kernel at compile time and the page tables are sealed so no new code can be loaded.

You can achieve the isolation with jails and cgroups, but not the performance improvements.
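
For anyone who wants a rough feel for the system call overhead mentioned above, here is a minimal sketch that times a trivial system call against a plain user-space function call. The helper names (`plain_call`, `per_call_ns`) are made up for illustration, and interpreter overhead is included in both numbers, so treat the gap between them as a rough indicator only, not a measurement of what a unikernel would save.

```python
import os
import timeit


def plain_call():
    """A no-op function: the call never leaves user space."""
    return None


def per_call_ns(fn, number=200_000):
    """Average wall-clock cost of one call to fn, in nanoseconds."""
    total_seconds = timeit.timeit(fn, number=number)
    return total_seconds / number * 1e9


if __name__ == "__main__":
    user_ns = per_call_ns(plain_call)
    # os.getppid() is a thin wrapper around the getppid system call,
    # so each call crosses into the kernel and back.
    sys_ns = per_call_ns(os.getppid)
    print(f"user-space call: {user_ns:.0f} ns per call")
    print(f"getppid syscall: {sys_ns:.0f} ns per call")
```

The numbers vary a lot by machine, kernel, and mitigation settings, and a real workload makes far fewer system calls per unit of work than this loop does.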

2 comments

As mentioned in a sibling comment, you still have the hypercalls, and you definitely need them if you're running at ring 0: direct access to the hardware is probably an opportunity to attack the whole physical system, since hardware often has arbitrary bus access. Never mind the need to arbitrate access between multiple VMs.

And this is what I mean when I say that taken to its conclusion you're just reinventing processes.

I think this kind of performance claim needs to be demonstrated on something at least vaguely like a real running application before it can be taken as a given.

Fair point. More benchmarks need to happen before it's obvious this is really a win. A real application would be nice. I'm biased because I worked on a similar idea myself and I've been waiting for this to come. I think it's a potential win now for running on public clouds.

However, as Docker PaaS gains popularity, that may be a better alternative. Only benchmarks will tell :)

What did you work on?
A similar but much simpler idea: porting some simple application code to MiniOS. I never ended up with anything of value, though.
You can use I/O virtualization to allow direct hardware access in a safe fashion, assuming that your CPU and peripherals support it.
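
To make the idea concrete, here is a toy model of what an IOMMU (for example Intel VT-d or AMD-Vi) does for device DMA: each device gets its own translation table, and any access outside that table faults instead of reaching memory. Everything here (`ToyIommu`, `ToyMemory`, the `"guest-nic"` device name, the page layout) is hypothetical and only sketches the concept; real hardware is programmed very differently.

```python
PAGE_SIZE = 4096


class DmaFault(Exception):
    """Raised when a device touches an address it has no mapping for."""


class ToyIommu:
    """Hypothetical per-device translation tables (not a real API)."""

    def __init__(self):
        # device name -> {device-visible page number: physical page number}
        self._tables = {}

    def map_page(self, device, io_page, phys_page):
        self._tables.setdefault(device, {})[io_page] = phys_page

    def translate(self, device, io_addr):
        page, offset = divmod(io_addr, PAGE_SIZE)
        table = self._tables.get(device, {})
        if page not in table:
            raise DmaFault(f"{device}: no mapping for I/O address {io_addr:#x}")
        return table[page] * PAGE_SIZE + offset


class ToyMemory:
    """Physical RAM that devices can only reach through the IOMMU."""

    def __init__(self, iommu, size_pages):
        self._iommu = iommu
        self._ram = bytearray(size_pages * PAGE_SIZE)

    def dma_write(self, device, io_addr, data):
        # Translate every byte first so a faulting write changes nothing.
        targets = [self._iommu.translate(device, io_addr + i)
                   for i in range(len(data))]
        for target, byte in zip(targets, data):
            self._ram[target] = byte

    def read(self, phys_addr, length):
        return bytes(self._ram[phys_addr:phys_addr + length])


if __name__ == "__main__":
    iommu = ToyIommu()
    ram = ToyMemory(iommu, size_pages=8)
    # Assumption for the demo: physical page 0 belongs to the hypervisor,
    # physical page 5 is a buffer owned by the guest.
    iommu.map_page("guest-nic", io_page=0, phys_page=5)

    ram.dma_write("guest-nic", 0x10, b"pkt")  # lands in the guest buffer
    try:
        ram.dma_write("guest-nic", 3 * PAGE_SIZE, b"evil")  # unmapped
    except DmaFault as fault:
        print("blocked:", fault)
```

The point of the sketch is only that the device never sees physical addresses: it can reach the pages the hypervisor chose to map for it and nothing else, provided the platform actually has such a unit and it is enabled.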
This isn't the attack I'm referring to. The peripherals themselves potentially have complete access to the bus through DMA, so being able to convince them to write to an inappropriate physical address (the hypervisor's kernel, say) could lead to a significant breach of the security model. As far as I know, no processor-level features actually protect against this.
You've eliminated system calls but you have hypercalls; it's not clear whether this is faster than a container-based system that has system calls but no hypercalls.
True. Of course, one other advantage is that you can run a unikernel on a public cloud. You can of course run an OS to serve as a host for containers on a public cloud, but then you have an additional layer of overhead.
I wish there was more experimentation on cloud architecture; VMs aren't the be-all and end-all of the cloud IMO.