Hacker News
by hello0904 638 days ago
Serious question for you, why use Docker at all? You can just get rid of the clunky overhead.

You mentioned a Python backend, so literally just replicate the build script directly on the VPS: "pip install -r requirements.txt" > "python main.py" > "nano /etc/systemd/system/myservice.service" > "systemctl start myservice" > Tada.

You can scale instances by just throwing those commands in a bash script (build_my_app.sh) = your new Dockerfile...install on any server in xx-xxx seconds.
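To make the "script instead of a Dockerfile" idea concrete, here is a minimal sketch of the one step that is not a one-liner: generating the unit file. It is written in Python so it can be checked without touching a server. The service name, user, interpreter path, and directories are hypothetical placeholders, not anything from the setup described above.

```python
from pathlib import Path


def render_unit(name: str, workdir: str, python: str = "/usr/bin/python3",
                user: str = "www-data") -> str:
    """Return the text of a simple systemd service unit for a Python app.

    All values are illustrative assumptions; adjust them for your host.
    """
    return "\n".join([
        "[Unit]",
        f"Description={name}",
        "After=network-online.target",
        "Wants=network-online.target",
        "",
        "[Service]",
        f"User={user}",
        f"WorkingDirectory={workdir}",
        f"ExecStart={python} {workdir}/main.py",
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


def write_unit(name: str, text: str, root: str = "/etc/systemd/system") -> Path:
    """Write the unit under `root` and return its path (needs root on a real host)."""
    path = Path(root) / f"{name}.service"
    path.write_text(text)
    return path
```

On a real host you would follow this with `systemctl daemon-reload` and `systemctl enable --now myservice` so the unit is picked up and started.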

9 comments

I mentioned Docker because it interests many developers but on VMs that I control I do not need Docker at all. Deploying with Docker provides host OS independence which is nice if you are distributing but unnecessary if the host is yours, running a fixed OS.

For Python backends I often deploy the code directly with a Puppet resource called vcsrepo, which basically places a certain tag of a certain repo on a certain filesystem location. I also package the systemd scripts for easy start/stop/restart. You can do this with other config management tools, via bash or by hand, depending on how many systems you manage.

What bothers me with your question is Pip :-) But perhaps that is off topic...?

No, you are tied to docker supported operating systems.

Will not run on FreeBSD, for example.

I'll correct myself:

s/host OS independence/a certain level of host OS independence

And getting containers to run depends on the OS - if you don't control the host, leads to major ping-pongs.

Even within Linux (Ubuntu, Debian, RHEL, etc) when you are distributing multiple related containers there are details to care about, not about the container itself but about the base OS configuration. It's not magic.

>Will not run on FreeBSD, for example.

Not true:

https://podman.io/docs/installation#installing-on-freebsd-14...

ATM experimental

Yes, so not really supported.
That's the lamest excuse ever, are you a tech guy or a lawyer?
OP is talking about substituting a Kubernetes setup. FreeBSD was never in the cards. 99% of companies in the cloud don’t run or care about anything other than Linux.
That may be true, but it’s still not “host OS independence”, which was my point
> No, you are tied to docker supported operating systems

No, you're tied to operating systems using a Linux kernel that supports the features necessary for running images.

You can run Linux under FreeBSD using either bhyve, using the Linux emulator and under jails. But you cannot run docker.
>But you cannot run docker.

You can -> Podmaaan

https://podman.io/docs/installation#installing-on-freebsd-14...

ATM experimental

Famously, no one has ever had Python environment problems :D
If you really want to open that can of worms, here it goes:

PyPI is an informal source of software that has low security levels and has been infested with malware many times over the years. It does not provide security updates: it provides updates that might include security-related changes as well as functional changes. Whenever you update a package from there, there is a chain reaction of dependency updates that insert untested code in your product.

Due to this, I prefer to target an LTS platform (Ubuntu LTS, Debian, RHEL...) and adapt to whatever python environment exists there, enjoying the fact that I can blindly update a package due to security (ex: Django) without worrying that it will be a new version which could break my app. *

Furthermore, with Ubuntu I can get a formal contract with Canonical without changing anything on my setup, and with RHEL it comes built-in with the subscription. Last time I checked, Canonical's security team was around 30 people (whereas PyPI recently hired their first security engineer). These things provide supply-chain peace of mind to whoever consumes the software, not only to whoever maintains it.

I really need to write an article about this.

* exceptions apply, context is king

I've just doubled down on "making my own Debian packages".

There's tons of examples, you are learning a durable skill, and 90% of the time (for personal stuff), I had to ask myself: would I really ever deploy this on something that wasn't Debian?

Boom: debian-lts + my_package-0.3.20240913

...the package itself doesn't have to be "good" or "portable", just install it, do your junk, and you don't have to worry about any complexity coming from ansible or puppet or docker.

However: docker is also super nice! FROM debian:latest ; RUN dpkg -i my_package-*.deb

...it's nearly transparent management.

I don't mean this as a rebuttal, but rather to add to the discussion. While I like the idea of getting rid of the Docker layer, every time I try to I run into things that remind me why I use Docker:

1. Not needing to run my own PPA server (not super hard, it's just a little more friction than using Docker hub or github or whatever)

2. Figuring out how to make a deb package is almost always harder in practice for real world code than building/pushing a Docker container image

3. I really hate reading/writing/maintaining systemd units. I know most of the time you can just copy/paste boilerplate from the Internet or look up the docs in the man pages. Not the end of the world, just another pain point that doesn't exist in Docker.

4. The Docker tooling is sooooo much better than the systemd/debian ecosystem. `docker logs <container>` is so much better than `sudo journalctl --no-pager --reverse --unit <systemd-unit>.service`. It often feels like Linux tools pick silly defaults or otherwise go out of their way to have a counterintuitive UI (I have _plenty_ of criticism for Docker's UI as well, but it's still better than systemd IMHO). This is the biggest issue for me--Docker doesn't make me spend so much time reading man pages or managing bash aliases, and for me that's worth its weight in gold.

Yuuup! I'm super-small time, so for me it's just `scp *.deb $TARGET:.` (no PPA, although I'm considering it...)

Really, my package is currently mostly: `Depends: git, jq, curl, vim, moreutils, etc...` (ie: my per-user "typically installed software"), and I'm considering splitting out: `personal-cli`, `personal-gui` (eg: Inkscape, vlc, handbrake, etc...), and am about to have to dive in to systemd stuff for `personal-server`, which will do all the caddy, https, and probably cgi-bin support (mostly little home automation scripts / services).

I'm 100% with you w.r.t. the sudo journalctl garbage, but if you poke at cockpit https://www.redhat.com/sysadmin/intro-cockpit - it provides a nice little GUI which does a bunch of the systemd "stuff". That's kind of the nice tag-along ecosystem effect of "just be a package".

I'm definitely relatively happy with docker overall, but there's useful bits in being more closely integrated with the overall package system management (apt install ; apt upgrade ; systemctl restart ; versions, etc...), and the complexity that you learn is durable and consistent across the system.

In situations at work where we use something as an alternative to Docker as a deployment target, it's Nix. That has its own problems and we can talk about them, but in the context of that alternative I think some of your points are kinda backwards.

> 1. Not needing to run my own PPA server (not super hard, it's just a little more friction than using Docker hub or github or whatever)

Docker actually has more infrastructure requirements than alternatives. For instance, we have some CI jobs at work whose environments are provided via Nix and some whose environments are provided by Docker. The Docker-based jobs all require management of some kind of repository infrastructure (usually an ECR). The Nix-based jobs just... don't. We don't run our own cache for Nix artifacts, and Nix doesn't care: what it can find in the public caches we use, it does, and it just silently and transparently builds whatever else it needs (our custom packages) from source. They get built just once on each runner and then are reused across all jobs.

> 2. Figuring out how to make a deb package is almost always harder in practice for real world code than building/pushing a Docker container image

Definitely depends on the codebase, but sure, packaging usually involves adhering to some kind of discipline and conventions whereas Docker lets you splat files onto a disk image via any manual hack that strikes your fancy. But if you don't care about your OCI images being shit, you might likewise not care about your DEB packages being shit. If that's the case, you can often shit out a DEB file via something like fpm with very little effort.

> 3. I really hate reading/writing/maintaining systemd units. I know most of the time you can just copy/paste boilerplate from the Internet or look up the docs in the man pages. Not the end of the world, just another pain point that doesn't exist in Docker.

> 4. The Docker tooling is sooooo much better than the systemd/debian ecosystem. `docker logs <container>` is so much better than `sudo journalctl --no-pager --reverse --unit <systemd-unit>.service`. It often feels like Linux tools pick silly defaults or otherwise go out of their way to have a counterintuitive UI (I have _plenty_ of criticism for Docker's UI as well, but it's still better than systemd IMHO). This is the biggest issue for me--Docker doesn't make me spend so much time reading man pages or managing bash aliases, and for me that's worth its weight in gold.

I don't really understand this preference; I guess we just disagree here. Systemd has been around for like a decade and a half now, and ubiquitous for most of that time. The kind of usage you're talking about is extremely well documented and pretty simple. Why would I want a separate, additional interface for managing services and logs when the systemd stuff is something I already have to know to administer the system anyway? I also frequently use systemd features that Docker just doesn't have, like automatic filesystem mounts (it can do some things fstab can't), socket activation, user services, timers, dependency relations between units, describing services that should only come up after the network is up, etc. Docker's tooling really doesn't seem better to me.

> Docker actually has more infrastructure requirements than alternatives.

I was mostly comparing Docker to system packages, and I was specifically thinking about how trivial it is to use Docker Hub or GitHub for image hosting. Yeah, it's "infrastructure", but it's perfectly fine to click that into existence until you get to some scale. I would rather do that than operate a debian package server. Agreed that Nix works pretty well for that case, and that it has other (significant) downsides. I'm spiritually aligned with Nix, but Docker has repeatedly proven itself more practical for me.

> Definitely depends on the codebase, but sure, packaging usually involves adhering to some kind of discipline and conventions whereas Docker lets you splat files onto a disk image via any manual hack that strikes your fancy. But if you don't care about your OCI images being shit, you might likewise not care about your DEB packages being shit. If that's the case, you can often shit out a DEB file via something like fpm with very little effort.

I'm not really talking about "splatting files via manual hack", I'm talking about building clean, minimal images with a somewhat sane build tool. And to be clear, I really don't like Docker as a build tool, it's just far less bad than building system packages.

> don't really understand this preference; I guess we just disagree here. Systemd has been around for like a decade and a half now, and ubiquitous for most of that time.

Yeah, I don't dispute that systemd has been around and been ubiquitous. I mostly think its user interface is hot garbage. Yes, it's well documented that you can get rid of the pager with `--no-pager` and you can put the logs in a sane order with `--reverse` and that you specify the unit you want to look up with `--unit`, but it's fucking stupid that you have to look that stuff up in the man pages at all, never mind type it every time (or at least maintain aliases on every system you operate), when it could just do the right thing by default. And that's just one small example; everything about systemd is a fractal of bad design, including the unit file format, the daemon-reload step, the magical naming conventions for automatic host mounts, the confusing and largely unnecessary way dependencies are expressed, etc ad infinitum.

> Why would I want a separate, additional interface for managing services and logs when the systemd stuff is something I already have to know to administer the system anyway?

I mean, first of all I'm talking about my preferences, I'm not trying to convince you that you should change, so if you know and like systemd and you don't know Docker, that's fine. And moreover, I hate that I have to choose between "an additional layer" and "a sane user interface", but having tried both I've begrudgingly found the additional layer to be the much less hostile choice.

> I also frequently use systemd features that Docker just doesn't have, like automatic filesystem mounts (it can do some things fstab can't), socket activation, user services, timers

Yeah, I agree that Docker can't do those things. I'm not even sure I want it to do those things. I'm talking pretty specifically about managing my application processes. But yeah, since you mention it, fstab is another technology that has been around for a long time, is ubiquitous, and is still wildly, unnecessarily hostile to users (it can't even do obvious things like automounting a USB device when it's plugged in).

> ... dependency relations between units, describing services that should only come up after the network is up, etc. Docker's tooling really doesn't seem better to me.

Docker supports dependency relations between services pretty well, via its Compose functionality. You specify what services you want to run, how to test their health, and how they depend on each other. You can have Docker restart them if they die so it doesn't really matter if they come up before the network (but I've also never had a problem with Docker starting anything before the network comes up)--it will just retry until the network is ready.
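For anyone unsure what "dependency relations between services" buys you in either tool, here is a rough sketch of the ordering part only. It is not how Compose or systemd is implemented; it just shows that a mapping of "service -> things it depends on" (what Compose's `depends_on` or systemd's `After=`/`Requires=` express) determines a start order. The service names are made up, and health checks (the readiness half of the story) are out of scope here.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each maps to the services it depends on,
# mirroring a Compose `depends_on` list or systemd `After=`/`Requires=`.
DEPENDS_ON = {
    "db": [],
    "api": ["db"],
    "web": ["api"],
}


def start_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return an order in which every service starts after its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


print(start_order(DEPENDS_ON))  # ['db', 'api', 'web']
```

Ordering alone does not guarantee the dependency is ready to accept connections, which is why restart policies or health checks still matter.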

Docker's tooling is better in its design, not necessarily a more expansive featureset. It has sane defaults, so if you do `docker logs <container>` you get the logs for the container without a pager and sorted properly--you don't need to remember to invoke `sudo` or anything like that assuming you've followed the installation instructions. Similarly, the Compose file format is much nicer to work with than editing systemd units--I'm not a huge fan of YAML, but it's much better than the INI format for the kind of complex data structures required by the domain. It also doesn't scatter configs across a bunch of different files, it doesn't require a daemon-reload step, the files aren't owned by root by default, they're not buried in an /etc/systemd/system/foo/bar/baz tree by default, etc.

Like I said, I don't think Docker is perfect, and I have plenty of criticism for it, but it's far more productive than dealing with systemd in my experience.

This is the way. And truthfully if you can learn to package for Debian, you already know how to package for Ubuntu and you can easily figure out how to package for openSUSE or Fedora or Arch.
Even `alien` or I think ~suckless package manager~ `fpm` for 90% of things.
Option 1: python3 -m venv venv > source venv/bin/activate

Option 2: use Poetry

How is this different than a Dockerfile that is creating the venv? Just add it to beginning, just like you would on localhost. But that is why I love to code Python in PyCharm, they manage the venv in each project on init.

My comment about pip is orthogonal to Docker. This is the same with or without Docker - I added a comment on this thread with more detail.
> why use Docker at all?

We have a simple cloud infrastructure. Last year, we moved all our legacy apps to a Docker-based deployment (we were already using Docker for newer stuff). Nothing fancy—just basic Dockerfile and docker-compose.yml.

Advantages:

- Easy to manage: we keep a repo of docker-compose.yml files for each environment.

- Simple commands: most of the time, it’s just "docker-compose pull" and "docker-compose up."

- Our CI pipeline builds images after each commit, runs automated tests, and deploys to staging for QA to run manual tests.

- Very stable: we deploy the same images that were tested in staging. Our deployment success rate and production uptime improved significantly after the switch—even though stability wasn’t a big issue before!

- Common knowledge: everyone on our team is familiar with Docker, and it speeds up onboarding for new hires.

I think a lot of (justifiable) Docker use comes out of being forced to use other tools & ecosystems that are fundamentally messy and not really intended for galactic-scale enterprise development.

I have found that going all-in with certain language/framework features, such as self-contained deployments, can allow for really powerful sidestepping of this kind of operational complexity.

If I was still in a situation where I had to ensure the right combination of runtimes & frameworks are installed every time, I might be reaching for Docker too.

Python, Ruby, and to a much larger extent PHP are the Docker showcase!

For example, if you have a program that uses wsgi and runs on python 2.7, and another wsgi program that runs on python 3.16, you will absolutely need 2 different web servers to run them.

You can give different ports to both, and install an nginx on port 80 with a reverse proxy. But software tends to come with a lot of assumptions that make ops hard, and they will often not like your custom setup... but they will almost certainly like a normal docker setup.
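As a rough illustration of the "different ports behind one reverse proxy" setup, here is a minimal WSGI sketch using only the standard library. The labels and port numbers are hypothetical; in the real scenario each app would run under its own interpreter version, and nginx on port 80 would forward to each port with `proxy_pass`.

```python
from wsgiref.simple_server import make_server


def make_app(label: str):
    """Build a tiny WSGI app that reports which backend answered."""
    def app(environ, start_response):
        body = f"hello from {label}\n".encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    return app


def serve(label: str, port: int) -> None:
    """Serve one app on 127.0.0.1:port; run one process per backend."""
    with make_server("127.0.0.1", port, make_app(label)) as httpd:
        httpd.serve_forever()
```

Running `serve("legacy", 8001)` in one process and `serve("new", 8002)` in another gives the proxy two upstreams to route to. Binding to 127.0.0.1 keeps the backends reachable only through the proxy.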

Dockerfiles compose and aren't restricted to running on linux. Those two reasons alone basically mean I never need to care about systemd again
Yeah, not caring about systemd is a big win for me. And I don't just mean the cryptic systemd unit syntax, but also the absolutely terrible ux of every CLI tool in the suite. I'm tired of having to pass half a dozen flags every time I want to view the logs of a systemd unit (or forgetting to type `sudo` before `systemctl`). I'm tired of having to remember the path to the systemd unit files on each system whenever I need to edit the files (is it `etc/systemd/system/...` or `etc/system/systemd/...`?). Docker is far from perfect, but at least it's intuitive enough that I don't have to constantly reference man pages or manage aliases.

I would love to do away with the Docker layer, but first the standard Linux tooling needs to improve a lot.

Honestly most people's dockerfile could just as well be a bash script.
I find Dockerfiles even simpler to work with than bash scripts.
Thing is, for many people they are just bash scripts with extra steps.
I am under the impression that those using Docker are those using shitty interpreted languages that fail hard on version incompatibilities, with Docker being used for version isolation as a workaround. How would a bash script help?
You don't run a Dockerfile on every machine, and a bash script doesn't produce an image. They're not even solving the same problem.
So many people only need one machine. And these people certainly don't need an image.
Exactly! This person gets it.

Oh, and not only build their app, they can take it a step further and set up the entire new VPS and app build in one simple script!

I feel y’all are too focused on the end product.

I deploy to pared down bare metal, but I use containerization for development, both local and otherwise, for me and contributors.

So much easier than trying to get a local machine to be set up identically to a myriad of servers running multiple projects with their idiosyncratic needs.

I like developing on my Qubes daily driver so I can easily spin up a server imitating vm, but if I’m getting your help, especially without paying you, then I want development for you to be as seamless as possible whatever your personal preferred setup.

I feel containerization helps with that.

Once you do it for long enough it might be worth it to consider configuration management where you declare native structured resources (users, firewall rules, nginx reverse proxies, etc) rather than writing them in shell.

I use Puppet for distribution of users, firewall rules, SSH hardening + whitelisting, nginx config (rev proxy, static server, etc), Let's Encrypt certs management + renewal + distribution, PostgreSQL config, etc.

The profit from this is huge once you have say 20-30 machines instead of 2-3, user lifecycle in the team that needs to be managed, etc. But the time investment is not trivial - for a couple of machines it is not worth it.

Honestly not having to use Puppet or Ansible are among my reasons for using Docker. I do some basic stuff in cloud-init (which is already frustrating enough) to configure users, ssh, and docker and everything else is just standard Docker tooling.

> I do some basic stuff in cloud-init (which is already frustrating enough)

What do you find frustrating about cloud-init? I'm relatively new to it.

The YAML structure seems poorly thought out, the documentation is low quality, the iteration loop is tedious, etc.
I'm doing it :)

I split it into multiple scripts that get called from one, just for my own sanity.

Because it seems unobvious but docker always saves you. It's actually quicker than running pip install -r requirements.txt once you get a year in. (Trust me, I used to take your approach).

Forget about "clunky overhead" - the running costs are < 10%. The Dockerfile? You don't even need one. You can just pull the image for the Python version you want, e.g. Python 3.11, and git pull your files from within the container to get up and running. You don't need to use container image saving systems, you don't need to save images or tag anything, you don't need to write setup scripts in the Dockerfile, and you can pass the database credentials through the environment option when launching the container.
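On the "pass the database credentials through the environment" point, the application side might look something like this sketch. The variable names (`DB_HOST`, `DB_PORT`, `DB_USER`, `DB_PASSWORD`) and the default port are assumptions for illustration; use whatever names you actually pass with `docker run -e` or `--env-file`.

```python
import os
from typing import Mapping


def load_db_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read database settings from environment variables.

    The variable names are hypothetical; fail fast if a required one
    is missing rather than connecting with half a configuration.
    """
    missing = [k for k in ("DB_HOST", "DB_USER", "DB_PASSWORD") if k not in env]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "host": env["DB_HOST"],
        "port": int(env.get("DB_PORT", "5432")),  # assumed default
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

The same function works unchanged outside a container, since a systemd unit can set the same variables with `Environment=` or `EnvironmentFile=`.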

The problem is that after a year or two you get clashes or weird stuff breaking, and modules dropping support for your Python version, preventing you from installing new ones. Case in point, Google's AI module (needed for Gemini and lots of their AI API services) only works on 3.10+. What if you started in 2021? Your Python, then cutting edge, would not work anymore, and that's only 3.5 years after that release. Yeah, you can use loads of curl. Good luck maintaining that for years though.

Numpy 1.19 is calling np.warnings but some other dependency is using Numpy 1.20 which removed .warnings and made it .notices or something

Your cached model routes for transformers changed default directory

You update the dependencies and it seems fine, then on a new machine you try and update them, and bam, wrong python version, you are on 3.9 and remote is 3.10, so it's all breaking.
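One cheap way to surface the "I'm on 3.9 and the remote is 3.10" surprise before anything gets installed is a version guard at the top of the entry point or deploy script. This is only a sketch; the `REQUIRED` value is an assumption standing in for whatever your project actually targets.

```python
import sys

REQUIRED = (3, 10)  # assumption: the minimum version the project targets


def python_ok(required=REQUIRED, current=None) -> bool:
    """Return True if the interpreter version is at least `required`."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(required)


def require_python(required=REQUIRED) -> None:
    """Exit early with a clear message instead of failing mid-install."""
    if not python_ok(required):
        have = ".".join(map(str, sys.version_info[:2]))
        need = ".".join(map(str, required))
        sys.exit(f"Python {need}+ required, found {have}")
```

It does not fix the mismatch, but it turns a confusing dependency error into a one-line explanation.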

It's also not simple in the following respect: your requirements.txt file will potentially have dependency clashes (despite running code), might take ages to install on a 4GB VM (especially if you need pytorch because some AI module that makes life 10x easier rather needlessly requires it).

Life with Docker is worth it. I was scared of it too, but there are a few key benefits for the everyman / solo dev:

- Literally docker export the running container as a .tar to install it on a new VM. That's one line and guaranteed the exact same VM, no changes. That's what you want, no risks.

- Back up is equally simple; shell script to download regular back ups. Update is simple; shell script to update git repo within the container. You can docker export it to investigate bugs without affecting the production running container, giving you an instant local dev environment as needed.

- When you inevitably need to update python you can just spin up a new VM with the same port mapping on Python 3.14 or whatever and just create an API internally to communicate, the two containers can share resources but run different python versions. How do you handle this with your solution in 4 years time?

- If you need to rapidly scale, your shell script could work fine, I'll give you that. But probably it takes 2 minutes to start on each VM. Do you want a 2 minute wait for your autoscaling? No you want a docker image / AMI that takes 5 seconds for AWS to scale up if you "hit it big".

Clunky overhead from Docker?

Sorry, but you've got no idea what you're talking about.

You can also run OCI images, often called Docker images, directly via systemd's nspawn. Docker doesn't create an overhead by itself; at its heart it's a wrapper around kernel features and iptables.

You don't need Docker for deployments, but let's not use completely made up bullshit as arguments, okay?

I have no idea what I am talking about? Docker is literally adding middleware between your Linux system and app.

That doesn't necessarily mean there aren't Pro's to Docker, but one Con to Docker is - it's absolutely overhead and complexity that is not necessary.

I think one of the most powerful features of Docker by the way is Docker Compose. This is the real superpower of Docker in my opinion. I can literally run multiple services and apps in one VPS / dedicated server and have it manage my network interface and ports for me? Uhmmm...yes please!!!! :)

Docker's runtime overheads on Linux are tiny. It's pretty much all implemented using namespaces, cgroups and mounts which are native kernel constructs.
Well designed, written and efficient...middleware. It's a wrapper around linux and a middle between my OS and my app! A spade is a spade.

There are cons beyond performance. For example Docker complexity - you need to learn a new filetype, a new set of commands, a new architecture, new configurations, spend hours reading another set of documentation. Buy and read another 300 page O'Reilly book to master and grasp something that again has Pro's and Con's.

For me? It's not necessary and I even know some Docker Kung-Fu but choose not to use it. I do use Docker Desktop occasionally to run apps and services on my localhost - it's basically a Docker Compose UI, and I really enjoy it.

> It's a wrapper around linux and a middle between my OS and my app

No. Docker doesn't "wrap" anything, and it certainly does not wrap Linux. Please reconsider looking at the documentation. It uses native kernel features. SystemD does a similar thing.

> For example Docker complexity - you need to learn a new filetype, a new set of commands, a new architecture, new configurations, spend hours reading another set of documentation

I can't say I agree.

A wrapper CLI that produces the same outcome wouldn't really be considered middleware, which surely should affect runtime?
Docker is native Linux. Your app uses the same kernel as the host. Is "chroot" middleware? No. Neither is docker.
It does require a running daemon. Other solutions, like podman, do not. There is an overhead associated with docker.
> Docker is literally adding middleware between your Linux system and app.

Not really, no. Docker just uses functionality provided by the Linux kernel for its exact use case. It's not like a VM.

> it's absolutely overhead and complexity that is not necessary.

This is demonstrably wrong. Docker introduces less complexity compared to system native tools like Systemd or Bash. Dockerfiles will handle those for you.

> I have no idea what I am talking about

I wouldn't say that. You seem to have strong puritanical opinions though.

O rly, pray tell, which middleware?

Your most powerful feature is literally a hostfile that docker generates on container start that's saved at /etc/hosts + Iptables rules

Edit: and if you don't want them, use `network_mode: host` and voila, none of that is generated

>have it manage my network interface and ports for me

...and bypass the host firewall by default unless you explicitly bind stuff to localhost :-/
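If you want to audit for this, a small check over your port mappings is one option. The sketch below is an assumption-laden helper, not part of any Docker tooling: it looks at short-syntax mappings (the `-p` / Compose `ports:` strings) and flags any that are not explicitly bound to a loopback address. Bracketed IPv6 host addresses are not handled.

```python
import ipaddress


def publishes_beyond_loopback(mapping: str) -> bool:
    """Return True if a short-syntax port mapping is not loopback-only.

    Handles "CONTAINER", "HOST:CONTAINER" and "IPV4:HOST:CONTAINER",
    each optionally followed by "/tcp" or "/udp".
    """
    spec = mapping.split("/", 1)[0]
    parts = spec.split(":")
    if len(parts) < 3:
        # No host IP given: by default Docker publishes on all interfaces.
        return True
    return not ipaddress.ip_address(parts[0]).is_loopback


for m in ["8080:80", "127.0.0.1:8080:80", "0.0.0.0:5432:5432/tcp"]:
    print(m, publishes_beyond_loopback(m))
```

Anything flagged either needs an explicit `127.0.0.1:` prefix or a deliberate decision that it should be reachable from outside.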

I don't particularly love or hate docker, but when I realized this, I decided to interact with it as little as possible for production environments. Such "convenient" defaults usually indicate that developers don't care about security or integrating with the rest of the system.

> docker doesn't create an overhead by itself

Yes it does, the Docker runtime (the daemon which runs under root) is horribly designed and insecure.

Insecure in what way? Rootful docker is a mature product that comes with seccomp and standard apparmor policies ootb!
It runs as root, requires sudo to use, turns off all system firewalls, and has no way of doing security updates for containers.
> It runs as root

A lot of system applications on a standard Linux machine run as root or run with rootful permissions. This problem is solved by sandboxing, confining permissions and further hardening.

> requires sudo to use

Yes. However, this is a security plus and not a disadvantage.

> turns off all system firewalls

This statement makes no sense.

> has no way of doing security updates for containers.

I don't know what you mean by this.

There isn't a "Docker runtime", and the daemon is not a runtime any more than systemd is a runtime. They're both just managing processes. If you want to argue that Docker containers have an overhead, you could maybe argue that the Linux kernel security features they employ have an additional overhead, but that overhead is likely to be marginal compared to a less secure approach and moreover since you're Very Concerned About Security™ I'm sure you would prefer to pay the security cost.
Duplicating a base Linux distribution a thousand times for every installed piece of software absolutely is overhead.

(Theoretically you could build bare images without pulling in Alpine or Ubuntu, but literally almost nobody ever does that. If you have the skills to build a bare Docker image then you don't need Docker.)

> Duplicating a base Linux distribution a thousand times for every installed piece of software absolutely is overhead.

You're not duplicating an entire distribution, just the user land that you want. Typically we use minimal user lands that just have certs and /etc/passwd and maybe `sh`. And to be clear, this is mostly just a disk overhead, not a CPU or memory performance overhead.

> Theoretically you could build bare images without pulling in Alpine or Ubuntu, but literally almost nobody ever does that

Yeah, we do that all the time. Google's "distroless" images are only about 2MiB. It's very commonly used by anyone who is remotely concerned about performance.

> If you have the skills to build a bare Docker image then you don't need Docker.

Building a bare Docker image isn't hard, and the main reason to use Docker in a single-host configuration is because Docker utilities are just far, far saner than systemd utilities (and also because it's just easier to distribute programs as Docker images rather than having to deal with system package repos and managers and so on).