by sltkr 818 days ago
No, because you can share common libraries across containers by putting them in a separate layer.

For example, if you have a complex service that consists of multiple binaries all written in C++ using boost, then for each binary you can create a container that contains a layer of a base OS (shared), C++ libraries (shared), boost libraries (shared), application binary (unique).
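To put rough numbers on that, here is a minimal sketch of the disk accounting. The layer names, image names and sizes are all made up for illustration; nothing here is measured from a real system.

```python
# Hypothetical layer sizes in MB; illustrative only, not measured.
LAYER_SIZES_MB = {
    "base-os": 80,
    "cpp-runtime": 20,
    "boost": 150,
    "bin-frontend": 12,
    "bin-worker": 9,
    "bin-scheduler": 7,
}

# Each (hypothetical) image is an ordered stack of layers, bottom layer first.
IMAGES = {
    "frontend": ["base-os", "cpp-runtime", "boost", "bin-frontend"],
    "worker": ["base-os", "cpp-runtime", "boost", "bin-worker"],
    "scheduler": ["base-os", "cpp-runtime", "boost", "bin-scheduler"],
}


def unshared_size_mb(images, sizes):
    """Disk use if every image carried its own copy of every layer."""
    return sum(sizes[layer] for stack in images.values() for layer in stack)


def shared_size_mb(images, sizes):
    """Disk use when each distinct layer is stored only once."""
    distinct = {layer for stack in images.values() for layer in stack}
    return sum(sizes[layer] for layer in distinct)


unshared = unshared_size_mb(IMAGES, LAYER_SIZES_MB)
shared = shared_size_mb(IMAGES, LAYER_SIZES_MB)
print(f"without sharing: {unshared} MB, with sharing: {shared} MB")
```

With these made-up sizes, the three shared layers are stored once instead of three times, so only the per-service binaries add to the total.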

All the services can now share their common libraries, both on disk and in memory, which reduces I/O and memory use. That's one of the main advantages of containers over virtual machines (VMs): each VM instance has a distinct region of memory that is not shared with others even if they happen to load bit-for-bit identical binaries into memory.

(I know, VM memory deduplication exists to ameliorate this problem, but here my previous comment applies: it's much easier to start from shared components and link them together than extract the shared data after the fact. And typically VMs have lots of nonsharable state that containers do share, like pretty much all writable kernel pages.)


Are you saying that two containers running the same image will share their common libraries in the host kernel's memory?

Based on my understanding of cgroups, that seems unintuitive to me. Are you certain that's the case? I may try testing this out when I get a chance.

Yes. And even containers running different images will share the libraries so long as they come from shared layers.

I guess thinking about it more, that does check out. The kernel loads shared libraries, and containers share the kernel.

> All the services can now share their common libraries, both on disk and in memory, which reduces I/O and memory use.

Wow, how is this possible using layers? How does Docker handle it if I subsequently modify one of the files of my layer in my container?

A Docker image consists of several layers, each of which contains only the modifications to the layers below it. Every layer, and therefore the final image, is immutable. On Linux, Docker typically uses OverlayFS to provide a unified view of the various layers.

A running container is based on an immutable image plus a single writable layer. That writable layer is unique to the container and holds all modifications that processes running in the container make on top of the immutable image.
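A toy model may make this concrete. This is only a sketch under simplifying assumptions: layers are plain dictionaries mapping paths to contents, and the `Container` class and its methods are hypothetical names of mine, not anything from Docker. Real OverlayFS works on files and directories and copies a file up into the writable layer on first write, which the model skips.

```python
from types import MappingProxyType


class Container:
    """Toy model: shared read-only image layers plus one private writable layer."""

    def __init__(self, image_layers):
        # Tuple of read-only mappings, bottom layer first; shared between containers.
        self.image_layers = image_layers
        # Private to this container; starts out empty.
        self.writable = {}

    def read(self, path):
        # The writable layer sits on top, so it wins over anything below it.
        if path in self.writable:
            return self.writable[path]
        # Otherwise search the image layers from top to bottom.
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes never touch the image; they land in the writable layer only.
        self.writable[path] = data


# A hypothetical two-layer image; MappingProxyType makes each layer read-only.
image = (
    MappingProxyType({"/etc/motd": "hello"}),
    MappingProxyType({"/app/server": "binary v1"}),
)

a = Container(image)
b = Container(image)
a.write("/etc/motd", "changed in a")

print(a.read("/etc/motd"))  # a sees its own modification
print(b.read("/etc/motd"))  # b still sees the image's version
```

Both containers hold a reference to the very same layer objects, and the change made in `a` is invisible to `b` because it only exists in `a`'s writable layer.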

Docker relies on the immutability of layers to share them between containers. This is not much different from how regular Linux processes all share the read-only contents of the binaries and libraries that they load, while each process has its own private heap space that is not shared with other processes.

That means that deleting a file from a base layer, either when building an image or at runtime from the container, doesn't actually modify the contents of that layer. It only adds a tombstone marker (a "whiteout" in OverlayFS terms) to the layer on top: the new image layer during a build, or the container's writable layer at runtime. The marker indicates that the file was deleted, and OverlayFS creates the illusion that the file no longer exists inside that container.

(The flipside is that deleting files from immutable layers doesn't actually free up space because the actual file contents don't go anywhere, but that's rarely a problem.)
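Here is a minimal sketch of that deletion behaviour, again with layers modelled as plain dictionaries. The `WHITEOUT` sentinel and the `lookup`, `delete` and `stored_bytes` helpers are hypothetical names for illustration; as far as I know, real OverlayFS represents a whiteout as a special file in the upper layer rather than anything like this.

```python
WHITEOUT = object()  # sentinel standing in for a tombstone/whiteout entry


def lookup(path, writable, image_layers):
    """Resolve a path through the writable layer, then the image layers top-down."""
    for layer in (writable, *reversed(image_layers)):
        if path in layer:
            if layer[path] is WHITEOUT:
                # The marker hides whatever the lower layers still contain.
                raise FileNotFoundError(path)
            return layer[path]
    raise FileNotFoundError(path)


def delete(path, writable, image_layers):
    """Hide a path by adding a marker on top; the image layers are not modified."""
    lookup(path, writable, image_layers)  # raises if the path is not visible
    writable[path] = WHITEOUT


def stored_bytes(image_layers):
    """Total size of the file contents held in the image layers."""
    return sum(len(data) for layer in image_layers for data in layer.values())


# A hypothetical one-layer image with a single 1000-byte file.
base = {"/usr/lib/libboost.so": b"x" * 1000}
image_layers = (base,)
writable = {}

before = stored_bytes(image_layers)
delete("/usr/lib/libboost.so", writable, image_layers)
after = stored_bytes(image_layers)

print(before, after)  # the image layers hold the same bytes before and after
```

After the delete, a lookup fails as if the file were gone, yet the base layer still holds every byte of it, which is why no space is freed.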