Hacker News
by TheCapeGreek 43 days ago
Somewhat adjacent in how I look at using Docker at all in prod, here's what I always wonder:

Is using Docker/Compose "just" as the layer for installing & managing runtime environment and services correct? Especially for languages like PHP?

I.e. am I holding it wrong if I run my "build" processes (npm, composer, etc) on the server at deploy time, the same as I would without containers? In that sense Docker Compose becomes more like Ansible for me - the tool I use to build the environment, not the entire app.

For the purpose of my question, let's assume I'm building normal CRUD services that can go a little tall or a little wide on servers without caring about hyper scale.

3 comments

> if I run my "build" processes (npm, composer, etc) on the server at deploy time

It's perfectly fine, as long as you accept the risks and downsides. Your IP can get rate-limited by Docker Hub. The build process can exhaust resources on the host. Your server probably needs access to an internal dev dependencies repository, and thus needs credentials it would not need otherwise. Many small things like that. The advantage is simplicity, and it's often worth the risk.

> IP ratelimited for Docker Hub

How? What I'm describing is using Docker less.

> The build process can exhaust resources on the host

Maybe, but I've yet to have a host where that's the case for usual CRUD fare.

> The advantage is simplicity, and it's often worth the risk.

That's basically what I'm evaluating for here.

For bog standard LAMP or similar stack applications, I've not understood the advantage of going through the build-image-then-pull-on-host rigmarole. There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.

But I have seen that done fairly often, hence was wondering what the point was.

I would say it's bad practice because you end up having to copy all the build dependencies (source code) to the host and you're potentially putting a bunch of extra load on the host during the build process.

Also adds moving parts to your deploy which increases risk/introduces more failure modes.

Couple things that come to mind

- disk space exhaustion during build

- I/O exhaustion, especially with package managers that have lots of small files (npm)
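If you do build on the host, a cheap guard against the disk-space case is a pre-flight check before the build runs. This is only a rough sketch: the helper names `free_kb` and `require_free_kb` are made up for illustration, the threshold in the usage comment is an arbitrary assumption, and it does nothing about I/O load.

```shell
# free_kb DIR
# Prints the available space, in kilobytes, on the filesystem holding DIR.
# Uses the POSIX output format of df (-P) with 1024-byte blocks (-k).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# require_free_kb DIR MIN_KB
# Succeeds only if at least MIN_KB kilobytes are available for DIR.
require_free_kb() {
    avail=$(free_kb "$1") || return 1
    [ "$avail" -ge "$2" ]
}

# Hypothetical usage in a deploy script (2 GiB is an arbitrary example value):
#   require_free_kb . 2097152 && npm ci && npm run build
```

The point is just to fail before the build starts instead of halfway through it.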

However, on the small/hobby end I don't think it's a huge concern.

> you end up having to copy all the build dependencies (source code) to the host

> disk, i/o exhaustion

This is why I mentioned specifically for ecosystems like PHP, which are interpreted. I'm specifically asking for that use case.

I'm not building binaries, my "build" steps are actually deployment steps (npm build, composer install, etc) that I'd be running in exactly the same way on the host. The image I'm deploying by definition also contains my source code because I'm not deploying anything compiled.

>I'm specifically asking for that use case.

That's what I answered for.

>I'm not building binaries

If you were, I would have added CPU to the list.

>my "build" steps are actually deployment steps (npm build, composer install, etc)

No, those are build steps. If you weren't using Docker, you would either run all those and shove in a zip/tarball or package into a deb/rpm, etc

>The image I'm deploying by definition also contains my source code

It doesn't contain .git or need credentials to your git/SCM

>I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios

You don't need a registry--you can Docker save/load to push images directly to the server. Images buy you a versioned artifact with all the code-level dependencies baked in. Some maintainer yanks their package from npm? Who cares--you have a copy in your Docker image. Your new app version doesn't work? Edit 1 line to point back to the old image tag and rollback.
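As a rough illustration of the "edit 1 line" rollback, assuming the server runs the app from a Compose file with a pinned `image: name:tag` line. The function name `set_image_tag`, the file name and the version numbers are all hypothetical.

```shell
# set_image_tag FILE IMAGE TAG
# Rewrites the tag on the "image: IMAGE:<old-tag>" line of a Compose file.
# Assumes IMAGE contains no regex metacharacters other than "." and "/".
set_image_tag() {
    file=$1
    image=$2
    tag=$3
    tmp=$(mktemp) || return 1
    # Keep the indentation and image name (group 1), swap only the tag.
    sed "s|^\([[:space:]]*image:[[:space:]]*${image}\):.*|\1:${tag}|" "$file" > "$tmp" || return 1
    # Write back in place so the original file keeps its permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Hypothetical rollback from 1.4.2 to the previous image:
#   set_image_tag compose.yaml myimage 1.4.1
#   docker compose up -d
```

Since the old image still contains its own vendored dependencies, the rollback does not need to reach npm or Packagist at all.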

>> The build process can exhaust resources on the host

>Maybe, but I've yet to have a host where that's the case for usual CRUD fare.

When the build process completes, it tears down the overlayfs, which causes everything to sync, which leads to a big I/O spike. Depending on the server and amount of files, it might have no impact. However, I've seen build servers become completely unresponsive for 5+ minutes due to the I/O load when this happens. One place I worked, we had to switch our build servers to NVMe--the Docker container teardown caused spikes over 100k IOPS. Can't remember the exact details--it was either a React web front end or a React Native mobile app.

>There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.

`docker save myimage:tag | gzip | ssh user@server 'gunzip | docker load'`

Not saying creating distributable artifacts is the de-facto answer, but I'd strongly consider whether it's really that much more complicated.

> Images buy you a versioned artifact with all the code-level dependencies baked in.

Fair enough, that buys a little bit of time to not break deployments I suppose.

> When the build process completes, it tears down the overlayfs

Ah okay, I misunderstood you then - I was referring to Docker-less servers and my build steps running there, not building the images on the machine.

Thanks for the info!

Have a look at multi stage container builds. Your images should not need a build step at start, the result should be in the baked image. Else you become reliant on fetching packages during build etc.
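A minimal sketch of what that could look like for a PHP app with an npm-built front end. Everything specific here is an assumption for illustration: the base images (`composer:2`, `node:20`, `php:8.3-apache`), the `public/build` output directory, and the function name `multistage_dockerfile`. It also assumes a `.dockerignore` that keeps `.git`, `vendor` and `node_modules` out of the build context.

```shell
# Prints a hypothetical multi-stage Dockerfile to stdout.
multistage_dockerfile() {
    cat <<'EOF'
# Stage 1: install PHP dependencies from the lock file
FROM composer:2 AS vendor
WORKDIR /app
COPY composer.json composer.lock ./
RUN composer install --no-dev --no-scripts --prefer-dist --no-interaction

# Stage 2: build front-end assets
FROM node:20 AS assets
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Final stage: runtime image with source, vendor dir and built assets
FROM php:8.3-apache
WORKDIR /var/www/html
COPY . .
COPY --from=vendor /app/vendor ./vendor
COPY --from=assets /app/public/build ./public/build
EOF
}

# Hypothetical usage: build from the current directory, Dockerfile on stdin.
#   multistage_dockerfile | docker build -t myimage:1.4.2 -f - .
```

Composer and npm only run at image build time, so starting the container needs no network access to package registries.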

I guess what I'm asking for is what the point is of a "baked" image for interpreted language ecosystems. Already using multi stage builds.

"Builds" are the same as deploys, so when working with server(s) instead of larger scale deployments, I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios.

But I have seen this in action, which is why I'm wondering if I'm missing something.

The clearer benefit to me seems to be in this scenario to use it as a fast environment provisioning tool.