by choeger 2040 days ago
Maybe you are just too far into docker. I noticed that a lot of default workflows (needlessly) depended on docker running with privileges. One big reason for that seems to be Mac users who only know docker from inside a VM. However, if you think about what you really need for CI, you will easily see that docker-in-docker gains you nothing. You can just as well use plain docker (or podman). The same holds for privileges. No CI operation should need privileges, if only for the reason that it should never alter the CI system itself.

I encourage you to not take the standard workflows as a given and really think about what you need and I bet you either end up with a use case that can be covered by rootless podman or something that requires real VMs anyways.

4 comments

Okay, I want to build a container image using gitlab CI, which runs builds in docker. How would you like me to build an image without using docker in docker, or buildah in docker?

We use kaniko[1] in Gitlab CI and it’s working great for us. It’s annoying that the kaniko image requires us to specify the entrypoint. There are some peculiarities with Dive [2], but otherwise it’s been a very easy migration.

[1] https://github.com/GoogleContainerTools/kaniko

[2] https://github.com/wagoodman/dive/issues/318

I tried kaniko and didn't like it.

Having build and push as part of the same job is frustrating and I view it as a sign that a CI system is built with the expectation of having everything happen post commit by shoveling money into a CI auto-scaler. I know there's `--no-push`, but that's a poor substitute for independent `build`, `tag`, `push` build steps IMO.

Do you have any way of running / debugging locally with GitLab CI plus kaniko? Can you run your build pipeline locally on your workstation against uncommitted code?

IMO I can build a way better local workflow that allows me to run builds _before_ committing with Drone (`drone exec`). I can toggle between a locally bound Docker daemon or a DIND environment that's going to be virtually identical to the DIND environment on a build runner. The `push` step doesn't run locally plus the secrets needed to push are only accessible from an official runner. I can run it on Windows or Linux (and likely Mac) too.
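
A rough sketch of that split, written in Python purely for illustration: build, tag and push are separate steps, and push is only scheduled when a registry credential is present. The `REGISTRY_TOKEN` variable, the image names and the `plan_steps` / `run_steps` helpers are made up for this example; only the `docker build`, `docker tag` and `docker push` subcommands are real.

```python
import os
import subprocess


def plan_steps(local_tag, remote_tag, env=None):
    """Return build/tag/push as separate (name, argv) steps.

    The push step is only included when REGISTRY_TOKEN is set.
    REGISTRY_TOKEN is a hypothetical secret that only an official
    runner would have, so a workstation run stops after tagging.
    """
    env = os.environ if env is None else env
    steps = [
        ("build", ["docker", "build", "-t", local_tag, "."]),
        ("tag", ["docker", "tag", local_tag, remote_tag]),
    ]
    if env.get("REGISTRY_TOKEN"):
        steps.append(("push", ["docker", "push", remote_tag]))
    return steps


def run_steps(steps):
    """Run each step in order and stop at the first failure."""
    for name, argv in steps:
        print(f"--> {name}: {' '.join(argv)}")
        subprocess.run(argv, check=True)
```

Locally you would call something like `run_steps(plan_steps("myapp:dev", "registry.example.com/myapp:dev"))` against uncommitted code and get build plus tag only; the same call on a runner that holds the secret would also push.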

I've been trying to find a self-hosted CI system that's really good at building Docker (or OCI) images and I don't think any exist. They all have shortcomings. Having build, tag, push act like an atomic build step is one of the areas where I think most fail. So many claim to enable repeatable builds but none actually do AFAIK. Whoever writes the Dockerfile needs to know a lot about how images are built to have the slightest chance at creating a repeatable build. A great example is having `apt-get update` in a Dockerfile. That command _always_ returns zero, so by itself it makes builds non-repeatable.

Sometimes I have a tough time reconciling the development industry because things just don't make sense to me. I remember people complaining about Gradle start times so much they came up with the Gradle daemon. Now no one bats an eye at CI based build systems where you have to commit your code, wait for a runner to get provisioned, wait for Docker or the OCI runtime to spin up, and wait for your project to actually build on some anemic VM.

People used to complain about seconds because the wait was "too slow" for good local iteration, but now waiting for minutes is a "good" build system. Seriously WTF?

I guess I got on a bit of a rant...

> I remember people complaining about Gradle start times so much they came up with the Gradle daemon. Now no one bats an eye at CI based build systems where you have to commit your code, wait for a runner to get provisioned, wait for Docker or the OCI runtime to spin up, and wait for your project to actually build on some anemic VM.

I want this framed or sewn onto a pillow or something.

It's amazing what we can build, it's baffling what we have built.

> Having build and push as part of the same job is frustrating and I view it as a sign that a CI system is built with the expectation of having everything happen post commit by shoveling money into a CI auto-scaler

No, the idea is to build and push images you can test directly afterwards in the same conditions. With cache and such, build times shouldn't be too long

Gitlab CI just runs shell commands; it's pretty trivial to pull the same image it's using in the job, and run the same commands locally.

If you have long CI times, that can hinder development productivity and should be improved as much as possible, or a local replica of the CI needs to be created.

> Gitlab CI just runs shell commands; it's pretty trivial to pull the same image it's using in the job, and run the same commands locally.

In fact, they already provide a utility for doing this: gitlab-runner exec[0].

[0]: https://docs.gitlab.com/runner/commands/#gitlab-runner-exec

Is it really working for you? Presumably you set up kaniko to build the images first, which then run the second part of your pipeline. To run the images, you need to tag them with something that you use inside the gitlab-ci.yml, right?

Now what happens when two people push code that makes changes to the containers at the same time?

GitLab CI exposes the pipeline ID and it would work for that.
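
A minimal sketch of what that could look like, in Python for illustration. `CI_PIPELINE_ID` and `CI_REGISTRY_IMAGE` are predefined GitLab CI variables; the `pipeline_image_ref` helper, the `pipeline-` tag prefix and the local fallbacks are assumptions made up for this example.

```python
import os


def pipeline_image_ref(env=None):
    """Build an image reference that is unique per pipeline.

    Two pushes create two pipelines with different CI_PIPELINE_ID values,
    so their images get different tags and cannot overwrite each other.
    The "local/app" and "dev" fallbacks are assumptions for runs outside CI.
    """
    env = os.environ if env is None else env
    repo = env.get("CI_REGISTRY_IMAGE", "local/app")
    pipeline_id = env.get("CI_PIPELINE_ID", "dev")
    return f"{repo}:pipeline-{pipeline_id}"
```

The build job tags the image with this reference and the later jobs in the same pipeline pull the same reference, since every job in a pipeline sees the same ID.
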
kaniko is really really slow:

    - https://github.com/GoogleContainerTools/kaniko/issues/1392
    - https://github.com/GoogleContainerTools/kaniko/issues/970
    - https://github.com/GoogleContainerTools/kaniko/issues/875
I've observed basically the same stuff.
Your problem is the assumption that a gitlab CI process requires a docker image. If you do the following, you are fine:

* Switch to a shell runner

* Put the CI dockerfile into your repo

* Provide an entry script for CI that builds the container on-demand (and manages caching/cleanup) and then runs the tests/whatever inside that container

The point here is that docker/podman provide you with everything you need as long as you have full control. By using gitlab's default CI, you relinquish this control.
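
The entry script from the list above could be sketched roughly like this (Python for illustration). The `ci/Dockerfile` path, the `ci-env` image name, the `/src` mount point and all helper names are assumptions; caching here just means "rebuild only when the Dockerfile changes", and cleanup of old images is left out.

```python
import hashlib
import os
import subprocess


def ci_image_tag(dockerfile_text, name="ci-env"):
    """Derive the tag from the Dockerfile contents so edits trigger a rebuild."""
    digest = hashlib.sha256(dockerfile_text.encode("utf-8")).hexdigest()[:12]
    return f"{name}:{digest}"


def image_exists(tag, engine="docker"):
    """`image inspect` exits non-zero when the image is not present locally."""
    result = subprocess.run(
        [engine, "image", "inspect", tag],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_ci(test_cmd, dockerfile="ci/Dockerfile", engine="docker"):
    """Build the CI image on demand, then run test_cmd inside it."""
    with open(dockerfile, encoding="utf-8") as fh:
        tag = ci_image_tag(fh.read())
    if not image_exists(tag, engine):
        subprocess.run(
            [engine, "build", "-t", tag, "-f", dockerfile, "."], check=True
        )
    workdir = os.getcwd()
    # Mount the checkout into the container and run the tests there.
    subprocess.run(
        [engine, "run", "--rm", "-v", f"{workdir}:/src", "-w", "/src", tag, *test_cmd],
        check=True,
    )
```

On a shell runner the job would then boil down to one call such as `run_ci(["make", "test"])`, and passing `engine="podman"` should work the same way since the subcommands used here exist in both tools.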

But how would you run integration tests between multiple docker containers? We launch our services in these docker containers and do some integration / e2e tests, and this very much requires Gitlab CI to launch docker containers while inside a Docker container.

After using dind for some time we chose to just mount /var/run/docker.sock and keep using the host machine’s docker instance (mostly for the cache), but all in all dind was working fairly well.

To be honest, to say “you shouldn’t be doing that” is missing the point; one should be able to do anything they want. In my opinion, the root cause here is docker’s client/server model, which is fixed by other container runtimes such as podman and rkt (which unfortunately is deprecated). One should be able to just launch containers as if they were just another process.

To the contrary. You should never expect to be able to do what you want with a system provided by someone else. Gitlab has its design decisions in their default setup and you need to take control to cater to your use case. This herd mentality of "everyone is doing it this way" is really fundamentally problematic.

Nowhere in my post am I saying "everyone is doing it this way".

You started out with asserting "you're too far into Docker", we bring up valid use cases for docker-in-docker, and then you saying "This herd mentality [..] is really fundamentally problematic" is really not adding a lot to the discussion.

No one brought up valid use cases for docker-in-docker. They brought up the issue that gitlab mandates docker as an interface (which I totally understand, btw.).

For instance the "how do I compose multiple docker containers" is trivial when you can just execute a script that runs docker or podman. If you really want, you can use docker-compose.
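
As a sketch of what "just execute a script" might mean here (Python for illustration): create a network, start the services on it, run the tests in one more container, and always clean up. The image names, the `run_id` prefix and both helpers are hypothetical; the `network create`, `run -d`, `--network`, `--network-alias`, `rm -f` and `network rm` pieces are ordinary docker CLI.

```python
import subprocess


def integration_plan(run_id, services, test_image, test_cmd, engine="docker"):
    """Return (setup, test, teardown) argv lists for one isolated test run.

    services maps a service name to an image. Container and network names
    are prefixed with run_id so parallel runs on one host do not collide,
    while --network-alias keeps the plain service name resolvable.
    """
    network = f"{run_id}-net"
    setup = [[engine, "network", "create", network]]
    teardown = []
    for name, image in services.items():
        container = f"{run_id}-{name}"
        setup.append(
            [engine, "run", "-d", "--name", container,
             "--network", network, "--network-alias", name, image]
        )
        teardown.append([engine, "rm", "-f", container])
    teardown.append([engine, "network", "rm", network])
    test = [engine, "run", "--rm", "--network", network, test_image, *test_cmd]
    return setup, test, teardown


def run_integration(setup, test, teardown):
    """Run the tests and always attempt cleanup afterwards."""
    try:
        for argv in setup:
            subprocess.run(argv, check=True)
        subprocess.run(test, check=True)
    finally:
        for argv in teardown:
            subprocess.run(argv, check=False)
```

None of this needs a container that launches containers; it only needs a job that is allowed to talk to a container engine directly.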

AFAIU you can't switch to a shell runner unless you self host gitlab runner.
That is exactly my point. Gitlab uses docker runners because it is much simpler for them. But why should you be constrained by what's simpler for gitlab?
Well, it's free to use GitLab's hosted runners. You have to pay for your own build server if you want to host your own...
Kaniko is a choice: https://github.com/GoogleContainerTools/kaniko

Userland docker builds

https://github.com/GoogleContainerTools/kaniko can do it but I just use podman / buildah with storage driver as vfs, and then neither requires root / high privileges.

I'm not sure why it doesn't do this by default. Performance I guess.
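
For reference, a small sketch of the invocation (Python for illustration; the `vfs_build_argv` helper is made up, the `--storage-driver` global option and the `build` / `bud` subcommands are real). The performance guess seems plausible: vfs makes a full copy for every layer instead of using copy-on-write, so it costs more disk and time.

```python
def vfs_build_argv(tag, context=".", tool="podman"):
    """Build argv for an image build that uses the vfs storage driver."""
    if tool == "podman":
        return ["podman", "--storage-driver", "vfs", "build", "-t", tag, context]
    if tool == "buildah":
        return ["buildah", "--storage-driver", "vfs", "bud", "-t", tag, context]
    raise ValueError(f"unsupported tool: {tool}")
```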

Our org just adopted Makisu[1] which works beautifully in our GitLab CI pipelines. After a week of setup and fiddling with the settings, we were able to migrate all our build jobs to Makisu and haven’t looked back. Build times are great too, especially if you set it up with a Redis cache.

[1]: https://github.com/uber/makisu

Have you used / considered kaniko before? If so, what are the advantages of Makisu?
Why not simply use that then?

We're using podman in containers inside gitlab ci.

We're _also_ running tests of containers inside containers in containers using gitlab-ci.

The main workaround we've applied is using crun as the runtime rather than runc.

Kaniko :) no DIND weirdness or privileged containers necessary
Lots of folks recommending kaniko, which we use. But it seems quite slow compared to docker build. Is this not an issue for others?
A container image is just an archive file with some metadata, could you take the same approach as Google Jib?
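
To make the "archive plus metadata" point concrete, here is a simplified sketch in Python that packs files into a layer tarball and writes a small config that refers to it by digest, with no daemon involved. It is not Jib's actual implementation and not a loadable image: a real image also needs a manifest, and the config fields shown are only a subset. The helper names and file paths are made up.

```python
import hashlib
import io
import json
import tarfile


def make_layer(files):
    """Pack {path: bytes} into an uncompressed tar; return (tar_bytes, digest)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):  # stable order keeps the digest stable
            data = files[path]
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp keeps the digest reproducible
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    tar_bytes = buf.getvalue()
    digest = "sha256:" + hashlib.sha256(tar_bytes).hexdigest()
    return tar_bytes, digest


def make_config(layer_digests, cmd):
    """Minimal image-config-style metadata; a simplified subset of the real thing."""
    return json.dumps(
        {
            "architecture": "amd64",
            "os": "linux",
            "config": {"Cmd": cmd},
            "rootfs": {"type": "layers", "diff_ids": layer_digests},
        },
        sort_keys=True,
    )
```

Because the timestamps and file order are fixed, the same input files always give the same layer digest, which is also the property the repeatable-build complaints elsewhere in the thread are asking for.
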
Why do you say that docker-in-docker buys him nothing? It's not obvious at all and you go into no detail whatsoever to back up your opinion.

In my experience, that is not true at all. Docker-in-docker allows me to deliver smaller images that can fit into a CI flow as language plugins instead of shipping a beastly 5G docker image with every possible language runtime I need to support for my CI tool.

It is because building the image with docker requires the docker client to talk to a dockerd daemon, so one has to give the client access to dockerd, which allows untrusted code to run as root on the host.

Docker-in-docker is a workaround to make docker work in CI.

Basically a security nightmare and bad design that podman doesn't have.

Any build script can do serious damage to the environment it runs in. Before docker, you'd have to create a new VM from time to time because the build agent had rotted away or died in an altercation with a bad build.

Docker in Docker in CI is like a lock on a door. It keeps honest people from being naughty, and is fairly efficient about it.

I don't think the question is "should I run CI in docker in docker," it's whose CI should I run in docker in docker. Me and my coworkers can share docker images. Customers or freeloaders cannot. So if that's in your problem domain, then you're right, it's a bad idea. But it isn't for most people.

You do know that spinning up a new VM only takes a few seconds? With projects like https://firecracker-microvm.github.io/, the difference between launching a new Docker container or a new VM is negligible.

This works great if you own or rent the hardware, but most cloud providers don't allow nested virtualization.

The cost is not spinning up the vm, it’s maintaining the images. Docker composability reduces the combinatorics problem to a dull roar, and democratizes some of the maintenance effort. You want an image with the bug fix from the latest point release of python? And you need it by noon? Knock yourself out.

Although there are tools to convert docker images to vm images. I expect if I were running community CI infrastructure, getting really familiar with those would be high on my priority list.

The other option that works really well in a single user environment is to bind to the runner's Docker daemon. That way builds run as siblings of the runner's daemon rather than as children via docker-in-docker.

The huge issue with that is security which is why it's only really practical for a single user or a small group of trusted users. A secondary issue is that (I think) builds can't run simultaneously because they can trample each other when tagging images (since all images are on the runner's daemon).
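
For anyone who hasn't seen the socket binding, a sketch of the invocation (Python for illustration; the `sibling_job_argv` helper and the `job-` naming are made up, the `-v /var/run/docker.sock:/var/run/docker.sock` bind mount is the usual way to do it):

```python
def sibling_job_argv(job_image, job_cmd, job_id):
    """argv for a job container that talks to the host's Docker daemon."""
    return [
        "docker", "run", "--rm",
        "--name", f"job-{job_id}",
        # Bind-mounting the socket hands the job full control of the host
        # daemon, which is why this only suits trusted users.
        "-v", "/var/run/docker.sock:/var/run/docker.sock",
        job_image, *job_cmd,
    ]
```

Any `docker build` run inside that job creates images on the host daemon, so giving each job its own tag is one way to keep concurrent builds from trampling each other.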

If I had to build a Docker focused CI system I'd think about using Weave Ignite (AWS Firecracker) to spin up VMs for runners with the Docker socket bound like described above. That way you get all the convenience of binding the Docker socket, but the isolation of a VM that gets thrown away after the build step (or pipeline) finishes. That idea also fits well with local running / debugging IMO because you can bind to the Docker socket on your development workstation (assuming you're not running a large build of parallel tasks which might be an unrealistic assumption).

For us it’s a matter of the CI tool fetching the source code for the docker image, then running docker build, and not necessarily immediately. So you have ‘docker build’ happening toward the end of a set of other tasks. Which I’d really like to have running on a fresh VM or container.

You could separate those into two builds, but the reason they are together is so people think about deployment, and in case any structural changes to the code need to coincide with deployment changes. For instance, breaking changes in APIs. I need a new version of tool/library and I need to change how I call it.

Kubernetes as a layer of indirection is another solution.
Kubernetes is a consistent management api for linux (so then you don't need to interact with iptables, mount and all that "hard stuff").

I don't think kubernetes is a solution in the context of building an image (a rootfs tree into a .tar.gz file).

Unless you are using kaniko which extends the kubernetes api to add the capability of creating images, but that is handled by kaniko itself via the same api.

I was probably unclear. Kubernetes is a good solution for managing containers (obviously). I use it for CI and it works very, very well, though the CI tools still have more features they could add with the integration.
> beastly 5G docker image

my beastly 12GB image that even includes Matlab wants a word with you

>> beastly 5G docker image

> my beastly 12GB image that even includes Matlab wants a word with you

Perhaps in the next 10 years we will be rediscovering packages. :P

If you are in the business of charging complex prices per bits over the network, then docker seems to be quite a good investment and making it as popular as possible is a good strategy to print money. /s

> If you are in the business of charging complex prices per bits over the network, then docker seems to be quite a good investment and making it as popular as possible is a good strategy to print money. /s

True, that.

To be fair, at least it allows me to avoid lots of the brokenness of Python packaging.

It's always good to report packaging bugs so people can fix them. Do you have examples of Python packages that can be improved?
See my previous comments.

tl;dr pip silently breaks my environments, mostly connected to upgrading numpy and other scientific/data science libraries.

Well, to be fair, it is packages - I'm just using Docker (for this section of our stack) as a different sort of VM, essentially. It runs a service manager and a VNC X session, for chrissakes ;)
Our images are 35GB, and I've spent much of the last two weeks breaking up files so we don't hit the 8GB per file limit, and my next week will be trying to avoid hitting the per-layer limit.
I think the ideal is using Nix to manage development dependencies and to handle building minimal docker images for deployment.
Why not just use nix at that point? You can at least retain the advantage of having truly immutable and reproducible builds.

I use docker for most of my clients' work but for in house stuff I just use nix.

Docker and Nix are mostly orthogonal technologies though. Nix is an excellent build tool while Docker is useful as a universal software distribution format. They're really useful together because Nix is actually good at creating compact Docker images. But of course, if you don't have to worry much about distribution, it would be much nicer to stick with Nix as you've mentioned.
Well, just nix works pretty well unless you want to deploy to K8s or Fargate or similar.
> One big reason for that seems to be Mac users who only know docker from inside a VM

That has not been the case for a good while now... Docker has been running directly on a hypervisor on the Mac.

Honest question, what's the difference between "from inside a VM" and "directly on a hypervisor"? I always thought it meant the same thing.
Docker for Mac is still running a VM via LinuxKit/HyperKit, but there's a big difference in overhead between LinuxKit and running VMWare/VirtualBox.
Well, in the case of Mac, basically performance, as you still need linux libs, kernel API and everything. But it's much lighter-weight than something like Virtualbox, VMWare, etc.

In Linux iirc, a hypervisor can share such resources with the host system (since they are both Linuxes).