by shatteredgate 1639 days ago
And all the other ones: https://man7.org/linux/man-pages/man7/namespaces.7.html

BSD jails are similar but not quite the same thing.


I don't get it. How are people using this flexibility to get things done in practice, and what uses aren't allowed by the jail model?
You can just compare the APIs: namespaces are like the individual components of a jail. You can use them to build something like a jail, or something different with a different security model. This was discussed at length in an old HN thread: https://news.ycombinator.com/item?id=13982620
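To make the "individual components" point concrete, here is a rough sketch (Linux only, and assuming /proc is mounted) that lists the namespaces the current process belongs to. The function name is made up for illustration; each entry under /proc/self/ns is one namespace type that can be shared or unshared independently.

```python
import os


def current_namespaces(pid="self"):
    """Map namespace name -> identifier for a process (hypothetical helper).

    Reads the symlinks under /proc/<pid>/ns. Returns an empty dict when
    that directory is not available (e.g. not Linux, or no /proc).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in sorted(names):
        try:
            # Each link target looks like "<type>:[<inode>]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(name, ident)
```

Two processes that show the same identifier for an entry share that namespace; they can differ on some entries and match on others, which is the mix-and-match part.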
This doesn't really answer the question. Yes, the Linux API seems more flexible, but when you think about it, it really isn't, because all the models that actually make any sense can be implemented using a simpler interface, which is what jails provide.

One real difference is that you need to be root to create a jail. It'll get fixed eventually: FreeBSD already has unprivileged chroot, and jail isn't that much different.

>Yes, the Linux API seems more flexible, but when you think about it, it really isn't, because all the models that actually make any sense can be implemented using a simpler interface, which is what jails provide.

Not really. Docker is probably the most straightforward example. I don't think it's possible to fully port Docker to jails, or at least I've never seen a successful port; some of the network topology features seem to be either impossible or not straightforward. But I could be wrong: I haven't looked into the technical details in years. Somebody told me a while ago that it might be working, but I haven't heard anything about it since.

Needing to be root is a major deficiency, though, and I can't take jails seriously with that limitation. One of the main focuses of Linux container work over the past several years has been making unprivileged namespaces a good option.

Docker is literally just a jail. You can do whatever network topology you want using vnet.

And yes, having to use root is a major issue. Looks fixable though.

Sorry, then I must be remembering some other issue. The effort to port Docker to BSD seems to have disappeared.

>And yes, having to use root is a major issue. Looks fixable though.

AFAIK it took a long time to get this working on Linux, because it can cause a lot of security issues.

> Needing to be root is a major deficiency though

Note: Linux also needs root for its namespaces. Or at least CAP_SYS_ADMIN, which grants enough that it's pretty much as good as root. See setns(2) and clone(2) for details. This is one of the complaints the Plan 9 people have always had with Linux namespaces.

Not anymore: unprivileged user namespaces mean you don't have to do that. That's how podman's "rootless containers" are able to work.
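If you want to check this on your own box, here is a minimal sketch that tries to create a user namespace from a non-root process. It assumes Linux with glibc; the helper name is made up, and the CLONE_NEWUSER value is copied from <sched.h>. The attempt happens in a forked child so the calling process is left alone. Whether it succeeds depends on kernel configuration and distro policy, so it returns a bool instead of assuming either answer.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <sched.h>; Linux-specific


def can_create_user_namespace():
    """Return True if a forked child can unshare(CLONE_NEWUSER).

    Hypothetical helper for illustration. Any failure (non-Linux libc,
    disabled by sysctl, seccomp filter, ...) is reported as False.
    """
    pid = os.fork()
    if pid == 0:
        # Child: try to move into a fresh user namespace, then exit.
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER) == 0:
                status = 0
        except Exception:
            status = 1
        finally:
            os._exit(status)
    # Parent: success is signalled by the child's exit code.
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


if __name__ == "__main__":
    print("unprivileged user namespaces available:",
          can_create_user_namespace())
```

Inside the new user namespace the process can then hold capabilities over other namespaces it creates, which is roughly the mechanism rootless container tools build on.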
Yes, I am aware that it's got more moving parts. What are you using this flexibility for?
I'm using them for several things, but the most straightforward one is probably that namespacing can be added gradually to services. You most likely already benefit from this if you use systemd. That's one way namespaces can be used differently from the Docker model.
What are you adding gradually, specifically? Like, a concrete example that names a namespace you may want to use. I'm trying to figure out what problems a half sandbox solves, and a vague "I just want to enable some capabilities" doesn't help here.
A lot of the various security options in systemd: https://www.freedesktop.org/software/systemd/man/systemd.exe...

The sandboxing and mount-related ones are implemented with namespaces, and the idea with them is to not make any of them mandatory so they can be slowly added to system services. That way you can get some of the benefits without needing to build a full rootfs/container for the service. I am not sure how any of those would be done with jails because jails require you to create a chroot and network interface, whereas in Linux the mount and network namespaces are just optional namespaces and you can still use the other namespaces without using them.
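As a sketch of what "slowly added" can look like: the snippet below only builds a systemd-run command line, it doesn't execute anything. PrivateTmp=, ProtectSystem=, and PrivateNetwork= are real systemd.exec settings; the helper name and the service path are made up for illustration.

```python
def systemd_run_argv(command, sandbox_options):
    """Build a systemd-run argv with optional sandboxing properties.

    Hypothetical helper: every key/value in sandbox_options becomes one
    --property=KEY=VALUE argument. Nothing is mandatory, so a service can
    start with an empty dict and gain options one at a time.
    """
    argv = ["systemd-run"]
    for key, value in sandbox_options.items():
        argv.append(f"--property={key}={value}")
    argv.extend(command)
    return argv


if __name__ == "__main__":
    service = ["/usr/local/bin/myservice"]  # placeholder path
    # Step 1: only a private /tmp (mount namespace), nothing else changes.
    print(systemd_run_argv(service, {"PrivateTmp": "yes"}))
    # Step 2: later, also make most of the filesystem read-only and
    # drop network access (network namespace).
    print(systemd_run_argv(service, {
        "PrivateTmp": "yes",
        "ProtectSystem": "strict",
        "PrivateNetwork": "yes",
    }))
```

The same settings can go in a unit file's [Service] section; the point is that each one stands alone and none of them requires building a rootfs first.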

> How are people using this flexibility to get things done in practice

Um... to loop back to the upthread point: Docker. People are using Docker, and Docker is using this stuff.

And how is it mixing and matching these APIs? Given that there's an OCI-compatible runner for jails (runj, compatible with runc -- which is what Docker uses to start containers), it seems to me that Docker isn't actually using the flexibility afforded by the APIs here, but is just using a relatively fixed set of options.

If I'm wrong: what is it using, and what problems is this flexibility solving?

I haven't tested runj, but just from looking at it, it doesn't seem fully compatible with everything runc does, because the OCI spec itself specifies a lot of Linux-specific functionality.
Can you provide some examples?
runC is literally the abstraction layer Docker wrote internally on top of Linux containers! It exists as a separate layer now because they spun it out precisely to freeze the API and enable other efforts like runj.

And runj, IIRC (though I'm not an expert in the space), wasn't a trivial 1:1 thing and required changes to the underlying jails layer to enable it.
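On the "Linux-specific functionality" point from upthread, one illustration: the OCI runtime config has a "linux" section whose "namespaces" list names each namespace type individually. The sketch below builds just that fragment. It is not a complete or valid config.json, the helper name is made up, and the version string is only a placeholder.

```python
import json


def oci_linux_fragment(namespace_types):
    """Build the namespaces part of an OCI runtime config (fragment only).

    Hypothetical helper. A real config.json also needs process, root,
    mounts, and so on; this shows only how namespaces are listed one by one.
    """
    return {
        "ociVersion": "1.0.2",  # placeholder version string
        "linux": {
            "namespaces": [{"type": t} for t in namespace_types],
        },
    }


if __name__ == "__main__":
    # A typical-looking selection; a runtime on another OS has to decide
    # what, if anything, each of these entries maps to.
    fragment = oci_linux_fragment(["pid", "network", "mount", "ipc", "uts"])
    print(json.dumps(fragment, indent=2))
```

A jail-based runtime has no per-namespace equivalent for each entry, which is one place where a 1:1 mapping gets awkward.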