

by staticassertion 1571 days ago

It's not a binary thing. I would say something is a boundary if it requires an additional vulnerability to bypass. Containers these days fit that model. The nuance is how strong of a boundary it is.

Containers rely on the Linux kernel. The Linux kernel is shit, in terms of security, for a number of reasons. So all an attacker needs is to own the kernel, and there are a lot of ways to do that. Containers block some system calls and can lower attack surface to a degree, which is great - I think it's a huge win that containers are so popular and that, finally, some degree of isolation is widespread.
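To make the "block some system calls" point concrete, here is a toy sketch in plain Python. It is not how a real runtime does it (real filters are seccomp-BPF programs installed in the kernel), and the allowlist and function names are made up for illustration. The point is only that whatever the filter allows still lands on the shared kernel.

```python
# Toy model only: real container runtimes install seccomp-BPF filters in the kernel.
# The allowlist below is hypothetical and far smaller than any real default profile.
ALLOWED_SYSCALLS = {"read", "write", "openat", "close", "exit_group"}


def filter_syscall(name: str) -> str:
    """Return "ALLOW" if the syscall is on the allowlist, otherwise "ERRNO"."""
    return "ALLOW" if name in ALLOWED_SYSCALLS else "ERRNO"


def reachable_surface(requested: set) -> set:
    """Syscalls that still reach the shared host kernel after filtering."""
    return requested & ALLOWED_SYSCALLS


if __name__ == "__main__":
    for call in ("read", "mount", "bpf"):
        print(call, filter_syscall(call))
    # Everything printed here is kernel code an attacker can still poke at.
    print(sorted(reachable_surface({"read", "write", "mount", "bpf"})))
```

The filter shrinks the attack surface, but a bug in any allowed call is still a bug in the one kernel every container shares.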

We'll be stuck in retroactive security mode until developers care to change that, especially ones with influence like kernel maintainers.

> My point is that if it were designed from the ground up with the hard security boundary in mind, would we have ended up with containers in the first place?

Absolutely not. We'd have ended up with something like Firecracker or gVisor. The issues with containers are fundamental to the concept of having a shared Linux kernel, which is basically what makes a container a container.

> If not, is there any realistic way to go from where we are to where we should be?

Use Firecracker or gVisor.

> Those have the downside of actually needing to run a VM though

I think at this point VMs are not that big of a deal. They're clearly good enough for the vast majority of people who are running on the cloud.

> don't allow nested virtualization so you're stuck running on an enormous bare metal box.

This part is a bummer.

The other option though is to just not care if your OS gets owned. Split your services up, move capabilities across other boundaries like mTLS.

1 comments

tptacek 1571 days ago

gVisor doesn't require nested virtualization, right? If you're willing to take a tenable user-mode-Linux performance hit, you should be able to run it on anything?


staticassertion 1571 days ago

My understanding is that gVisor supports two modes of execution - one with virtualization and one without. AFAIK the official recommendation is to use the one with virtualization, but I've never dug into it.


tptacek 1571 days ago

Yeah, the original mode uses ptrace to intercept system calls, and then just implements the system call itself.
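Roughly, the "intercept the call, then implement it yourself" idea looks like the sketch below. This is a conceptual toy in Python, not gVisor's code: there is no actual ptrace here, and `UserSpaceKernel`, `handle`, and the `sys_*` methods are hypothetical names. An intercepted call is dispatched to a handler that services it from the sandbox's own state instead of forwarding it to the host.

```python
import errno


class UserSpaceKernel:
    """Toy stand-in for a sandbox that services intercepted syscalls itself."""

    def __init__(self):
        self.files = {}  # in-memory "filesystem": path -> bytes

    def handle(self, name, *args):
        """Dispatch an intercepted syscall to our own implementation."""
        handler = getattr(self, "sys_" + name, None)
        if handler is None:
            # Unimplemented calls fail here; they are never passed to the host.
            return -errno.ENOSYS
        return handler(*args)

    def sys_write(self, path, data):
        """Append data to an in-memory file and return the byte count."""
        self.files[path] = self.files.get(path, b"") + data
        return len(data)

    def sys_read(self, path):
        """Return the contents of an in-memory file (empty if missing)."""
        return self.files.get(path, b"")


if __name__ == "__main__":
    kernel = UserSpaceKernel()
    print(kernel.handle("write", "/tmp/x", b"hello"))
    print(kernel.handle("read", "/tmp/x"))
    print(kernel.handle("mount", "/dev/sda", "/mnt"))
```

In this model the guest never touches a host syscall handler directly, which is why the approach needs no virtualization support, at the cost of the interception overhead on every call.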
