by encryptluks2 1832 days ago
Why not run containers in VMs in containers in VMs? :)

Seriously, VMs are hardly as secure as many people want to believe unless you're utilizing enclaves, and even those have vulnerabilities. I think a better approach is seccomp and whatever other filtering makes sense.


A while back I did some looking at FreeBSD jails to try to figure out why they don't have more mindshare (especially when paired with the nigh-superpower-granting ZFS).

I came away baffled that they weren't more widely-promoted, compared with Docker and friends. After thinking about it for a while, all I can figure is they're so straightforward to use and well-documented that there's no room to make one's name, or to make a buck, re-packaging them or wrapping them in complex tools, so there's little money or glory (= personal marketing via open-source project leadership/contributions) in promoting them.

[EDIT] that is: what would be a blog post in LXC/Docker land... doesn't exist, because it's covered perfectly well in the docs. What would be a simple open-source tool... becomes a blog post, because it's short, simple, and clear enough not to merit special software, but just a quick guide to existing tools. What would be a business, becomes a simple open-source tool without enough of a difficulty/convenience "moat" to support a business.

I suspect the answer includes it not being Linux, even with the compatibility layer available.
I'm sure that's some of it, but the trend seems to be moving away from leveraging OS-level tools anyway. As long as your containers (or jails) and the single important binary in each one start up OK and your network tuning on the parent OS isn't completely screwed up, the rest barely matters anymore.
It seems like you're missing a lot of things.

As a developer, how do I run FreeBSD Jails on my MacBook during development? With Docker for Mac, it is trivial for me to do everything on my Mac, and the fact that there is a virtual machine is completely invisible to me. Everything "Just Works". With FreeBSD Jails, I would have to actually interact with a VM constantly, including the pain of shipping files back and forth.

As a developer, are popular databases and applications pre-packaged as FreeBSD Jails so that I can spin one up on my laptop with a single command? Where is the Docker Hub equivalent?

As a developer, how do I orchestrate a collection of FreeBSD Jails for each project? With Docker, I define a single `docker-compose.yml` file for each project. With a single `docker-compose up`, the entire project is running including dependencies such as databases and other related projects in a completely reproducible fashion. This makes it trivial for coworkers to spin up a project on their machine and immediately be productive without spending an hour trying to get all the right versions of everything installed and up and running.

As someone responsible for deploying an application to production, what is the story around FreeBSD Jails for deploying across a cluster? Is there a Kubernetes-equivalent that can manage the allocation of resources, blue-green deployments, and manage the lifecycle of my FreeBSD Jails?

As someone responsible for deploying an application to production, do any of the major clouds support FreeBSD Jails? With Docker images, I can deploy those straight to ECS Fargate, Google Cloud Run, and half a dozen other services. Then I don't even have to think about my own infrastructure unless I need some really specialized hardware for a specific application.

> the rest barely matters anymore.

Everything else matters so much.

As to your earlier point about ZFS, most Linux distros these days seem to trivially support ZFS. Even TrueNAS is working on switching to Linux with their TrueNAS Scale offering.

It's not that I'm opposed to FreeBSD... FreeBSD is just a hard sell. It's hard to pin down exactly what you're gaining by throwing out all the collective Linux knowledge of an organization and switching to FreeBSD. FreeBSD is an N-th tier platform for pretty much every programming language except C, so good luck when you run into random subtle problems. Also, good luck doing hardware accelerated machine learning inference or training on FreeBSD... it's probably possible?

> the single important binary

This is also such a weird thing to throw out there. I like a good Go program myself, but most companies are not only deploying single-binary statically linked applications. Most companies are also deploying some kind of Ruby, Python, or Java application... none of which are likely to be a single file in practice. Most of them will have a variety of shared libraries, and I don't know if I've ever seen a Ruby application shipped in a `FROM scratch` container before. Technically possible, but that's just not common reality as far as I've seen. It sounds like you're proposing that everyone is already running in `FROM scratch` containers, so a FreeBSD Jail is just a drop-in replacement.

Linux containers are far from perfect, but as a developer... I have played with FreeBSD Jails before, and come away frustrated by all the work you have to do yourself.

> As a developer, are popular databases and applications pre-packaged as FreeBSD Jails so that I can spin one up on my laptop with a single command?

The closest you can get is BastilleBSD (framework for FreeBSD Jails) and their templates - available here:

https://github.com/BastilleBSD/templates

https://bastillebsd.org/templates/

> > the single important binary

> This is also such a weird thing to throw out there. I like a good Go program myself, but most companies are not only deploying single-binary statically linked applications. Most companies are also deploying some kind of Ruby, Python, or Java application... none of which are likely to be a single file in practice.

Sure, but usual practice with containers is to put each thing in its own, unless they are very tightly coupled. Web-app with a SQL database and a memory cache? Three containers. You can do otherwise, but that's typical. Usually each container ends up with one main, important running process, and not much else.
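
The three-container layout described above can be sketched in Python. This is only an illustration under stated assumptions: the service names (`web`, `db`, `cache`) and the image name `example/web-app` are hypothetical, and the code does not talk to Docker at all. It just models the start ordering that a `depends_on`-style declaration expresses.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical project: one main process per container.
services = {
    "web": {"image": "example/web-app", "depends_on": ["db", "cache"]},
    "db": {"image": "postgres", "depends_on": []},
    "cache": {"image": "memcached", "depends_on": []},
}


def start_order(services):
    """Return service names so that every dependency precedes its dependents."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["depends_on"]) for name, spec in services.items()}
    return list(TopologicalSorter(graph).static_order())


order = start_order(services)
print(order)
```

The web app always comes last here because it depends on both other services; the relative order of `db` and `cache` is not significant.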

[EDIT]

> As someone responsible for deploying an application to production, what is the story around FreeBSD Jails for deploying across a cluster? Is there a Kubernetes-equivalent that can manage the allocation of resources, blue-green deployments, and manage the lifecycle of my FreeBSD Jails?

> As someone responsible for deploying an application to production, do any of the major clouds support FreeBSD Jails? With Docker images, I can deploy those straight to ECS Fargate, Google Cloud Run, and half a dozen other services. Then I don't even have to think about my own infrastructure unless I need some really specialized hardware for a specific application.

These are exactly the kinds of things I was thinking of when I noted that the OS itself has been seriously diminished in importance, for modern workflows. I agree that most commercial or high-profile open-source "cloud" tools and platforms are built around LXC/Docker.

> Sure, but usual practice with containers is to put each thing in its own, unless they are very tightly coupled. Web-app with a SQL database and a memory cache? Three containers. You can do otherwise, but that's typical. Usually each container ends up with one main, important running process, and not much else.

I agree, but... getting all the application dependencies in there is more than just getting a single binary in there. If it's just a single-binary Go program, then a Jail works just fine, but it's not that simple for a Ruby application. I'm definitely not talking about databases running in the same container as the application. That's where Kubernetes and docker-compose come in for multi-container orchestration, which are things that FreeBSD Jails don't have as far as I know.

> These are exactly the kinds of things I was thinking of when I noted that the OS itself has been seriously diminished in importance

Yes, but... these are all the things that FreeBSD doesn't offer. These are the real reasons that people don't talk about FreeBSD Jails in the same breath as Docker. The Docker container itself (or the FreeBSD Jail) as a unit of isolation is the least interesting part of the ecosystem. All of the developer tools, orchestration tools, and prebuilt images are what make the Docker universe so interesting, and make FreeBSD Jails... less interesting.

You said you were confused why Jails don't have more mindshare. It has absolutely nothing to do with people being able to invent useless tools and write blog posts about them, and it has absolutely nothing to do with FreeBSD Jails being too well documented. You kind of implied those were the best explanations you could come up with. Those are not the problems at all, and it seems disingenuous to me to say you think those are the problems unless you really didn't know the things I mentioned in my first reply.

If technically best in the container space mattered, Illumos would be everywhere...
People say this a lot too, but Illumos also uses shared-kernel isolation. Linux + gVisor is probably (significantly) superior to it as far as security goes.
90%+ of Docker users aren't using gVisor; I don't disagree that it's good, but it feels like an aside.
Or z/OS
Jails are still shared-kernel isolation. Docker's reputation is mired in its earlier implementations, when it wasn't really even intended for multitenant isolation. Modern Docker, running with unprivileged containers (which is the norm), is substantially hardened. The real win over Docker is losing the shared kernel, which is what lots of people are doing, so the win to Jails is marginal.
TrueNAS exposed me to FreeBSD jails but what put me off is that there does not seem to be an equivalent of "docker build".

Jails seem to be treated like OpenVZ containers in the Linux world: a lighter alternative to virtual machines, not a way to build and distribute applications like Docker.

This is just my take after playing a few hours with jails, I would happily be proven wrong.

Heretics! Victimizing all the Fashionistas! Where would be the fun of endless shiny new things? The thrill of employing l33t google skillz to find just another solution to cut&paste in haste, with no wasted time reading boring old style manuals and documentation. Attention deficit is the hottest shit! Deal with it!
I don't know what people generally believe.

But the attack surface of a Linux kernel is very large, is pretty unpredictable, and can't be coherently masked out with rules (my favorite example is Jann Horn's VM reference count bug, which was a simple concurrency flaw in the core virtual memory system). By comparison, a Linux KVM hypervisor is not just a subset of the kernel by definition, but also a much smaller codebase, a tiny fraction of the whole kernel.

Replacing shared-kernel isolation like seccomp-filtered containers with VMs is, architecturally, simply the replacement of a large trusted computing base with a smaller one. If the overhead is acceptable, it's hard to argue with from a security perspective.

OK; https://github.com/harvester/harvester

Security and performance aren't the only driving forces; there are a lot of technical and operational benefits to the abstraction and standard interfaces that you get when running stacks that might otherwise look like someone took an Xzibit meme too far.

Also remember on a modern system, there are often at least 2 additional layers at work abstracting interfaces to the "bare metal" OS already.

I'm not disagreeing that abstraction can be useful, but the overhead of a VM is unnecessary if utilizing the full potential of containers. After all, the Linux kernel is acting as the hypervisor already, so might as well trust it enough to properly sandbox containers too and use the right functionality to do so. I also think that running a virtualization layer adds quite a bit of complexity, so while it is cool that projects and companies have made it work and integrated it with a container solution, eliminating the VM layer altogether seems more ideal IMO.
That's the approach taken by Google's gVisor (at the cost of I/O and network performance).
No, that's really not at all what gVisor is. gVisor is best thought of as user-mode Linux --- a complete reimplementation of most of the OS kernel. It's not a system call filter; it's something much closer to a VM than to seccomp.

gVisor is a very cool codebase. As an illustration of the approach: it includes its own TCP/IP stack; we use it in our command-line dev tool to allow people to SSH to their VMs over WireGuard without having to install WireGuard or obtain privileges to manage WireGuard.

gVisor, for better or for worse, does a whole lot of other things than just seccomp filtering, and it shows in performance tests.
gVisor does more than filtering; it basically reimplements the syscalls in an application kernel. At least with seccomp the performance overhead is minimal.
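
The distinction drawn in the comments above, between filtering system calls and reimplementing them, can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the class names `HostKernel`, `FilteredSandbox`, and `ApplicationKernel` are hypothetical, the "syscalls" are plain method calls, and nothing reflects the actual internals of seccomp or gVisor. A real application kernel still makes some calls to the host, which this sketch leaves out.

```python
class HostKernel:
    """Stand-in for the shared host kernel; records every call that reaches it."""

    def __init__(self):
        self.calls = []

    def syscall(self, name, *args):
        self.calls.append(name)
        return f"host:{name}"


class FilteredSandbox:
    """Filter-style isolation: a call is denied or passed straight to the host."""

    def __init__(self, host, allowed):
        self.host = host
        self.allowed = frozenset(allowed)

    def syscall(self, name, *args):
        if name not in self.allowed:
            raise PermissionError(f"syscall {name!r} blocked by filter")
        return self.host.syscall(name, *args)


class ApplicationKernel:
    """Reimplementation-style isolation: the sandbox services the call itself."""

    def __init__(self):
        self.files = {}  # in-sandbox state; the host is not involved in this toy

    def syscall(self, name, *args):
        handler = getattr(self, f"_sys_{name}", None)
        if handler is None:
            raise NotImplementedError(f"syscall {name!r} not implemented")
        return handler(*args)

    def _sys_write(self, path, data):
        self.files[path] = self.files.get(path, b"") + data
        return len(data)


host = HostKernel()
filtered = FilteredSandbox(host, allowed={"read", "write"})
filtered.syscall("write", "/tmp/a", b"hi")    # allowed, so it reaches the host kernel

app_kernel = ApplicationKernel()
app_kernel.syscall("write", "/tmp/a", b"hi")  # handled inside the sandbox
```

In the filter model every allowed call still exercises host kernel code, so the filter itself is cheap but the host's attack surface stays exposed. In the reimplementation model the sandbox carries its own logic for each call, which is where the extra overhead mentioned above would come from.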
How does gVisor fare against KVM and other hardware-accelerated VM solutions (Firecracker)?
Machine Turducken.