by tadbit 1522 days ago
I love stuff like this.

People will remark about how this is a waste of time, others will say it is absolutely necessary, even more will laud it just for the fun of doing it. I'm in the middle camp. I wish software/systems engineers would spend more time optimising for size and performance.


Wouldn't removing Docker entirely be a good optimization?
Docker adds other value to the lifecycle of your deployment. An "optimization" where you're removing value is just a compromise. Otherwise we'd all run our static sites on UEFI.
Redbean supports UEFI too, although we haven't added a bare metal implementation of Berkeley sockets yet. It's on the roadmap.
oh wow are you justine?

i've been meaning to ask you this for a decade. whatever happened to when you wrote a blog with insanely irritating serifs that connected certain letters together? what was the rationale behind that? never seen it since

I'm insanely impressed by APE and redbean by the way, blows OP out of the water!

Oh you mean the blog with the long s? I was reading a lot of books at the time that were written before 1800 and I found it so fascinating how different typography was back then. I found a font I could pay for called Quant that did a really good job reproducing archaic ligatures and the long s, so I used it on a blog for a short period of time. Sadly it got negative feedback. So lately I've been focusing on https://justine.lol/ which uses Roboto. I'm glad to hear you're enjoying it!
ahhh ligatures, that's the word I was looking for here. Yeah it was kind of irritating but good blog content so I just read it anyways. it was kind of hard not to read it as someone lisping everything though.
This is a really good point, and something I think a lot of people forget. It's true, the most secure web app is one written with no code/no OS/does nothing.

Adding value is a compromise of some increased security risk - and it's our job to mitigate that as much as possible by writing quality software.

What value is that, for running such a simple piece of software?
You can have multiple instances of the server running on the machine without interfering with each other.

You can limit file system access for the server to only a certain folder.

You can similarly limit port access and manage conflicts (e.g. multiple servers can think they are listening on a certain open port but those are mapped to something else on the host).

If you have multiple machines with different operating systems or even architecture you can deploy your server as a container more easily on them without needing to rebuild or test for each one.

You can have the same environment running locally during development or on CI servers without complicated setups.

The system can scale out a lot more easily to hundreds/thousands of machines if you decide to use something like Kubernetes.

A few off the top of my head.
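The port-mapping point above can be made concrete with a small Python sketch. This is a toy model and not how Docker implements it; the names `PortMapper`, `publish`, and `route`, the container names, and the port numbers are all made up for illustration. It only shows the idea that several servers can each believe they own port 80 while the host hands out distinct ports.

```python
class PortConflict(Exception):
    """Raised when two containers try to claim the same host port."""


class PortMapper:
    """Toy model of host-port -> (container, container-port) publishing."""

    def __init__(self):
        self._by_host_port = {}

    def publish(self, container, container_port, host_port):
        # Host ports are the scarce resource: each can be claimed only once.
        if host_port in self._by_host_port:
            owner, _ = self._by_host_port[host_port]
            raise PortConflict(f"host port {host_port} already used by {owner}")
        self._by_host_port[host_port] = (container, container_port)

    def route(self, host_port):
        # Look up which container (and internal port) receives the traffic.
        return self._by_host_port[host_port]


mapper = PortMapper()
# Both servers believe they listen on port 80 inside their own container.
mapper.publish("site-a", 80, 8080)
mapper.publish("site-b", 80, 8081)
```

The conflict only shows up on the host side: publishing a third container on host port 8080 would raise `PortConflict`, while reusing container port 80 is never a problem.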

The ability to pull the image on to any machine without needing to clone the source files and build it.

Smaller images mean faster pod starts when you auto scale.

You have to log in to some docker repository anyways and know the series of commands to actually run it. Cloning a repo and running a shell script is probably a lot easier and faster than that.

What kind of work are you doing that requires really fast auto scaling? Is a few minutes to spin up a new instance really that cumbersome? Can you not signal for it to spin up a new instance a tiny bit earlier than when it's needed when you see traffic increases?

> You have to login to some docker repository anyways and know the series of commands to actually run it. Cloning a repo and running a shell script is probably a lot easier and faster than that.

In isolation, yes. But if, for instance, you're already running a container orchestration tool with hundreds of containers, and have CI/CD pipelines already set up to do all of that, it's easier just to tack on another container.

Ok when you say a few off the top of my head it implies that there are a bunch and these are like some super obvious ones, but it sounds like this is actually only useful if you have a bunch of infrastructure set up to serve sites for projects and customers that need containerization and then you just throw this simple little static site docker instance in there because when you're maintaining a lot of docker instances it is just simpler to do?

Which seems like sort of an edge case for value adding, and makes me feel like it really doesn't add any value to do this unless you already are doing it for everything, and thus you really wouldn't be throwing out any value by just serving the static site without the docker overhead.

Adding to some of the other responses, one reason I chose to deploy a SPA I'm working on as a Docker image is atomicity - if I want to deploy a newer version I simply switch out the tag in my container orchestrator's config (Nomad in this case, but the same principles apply to k8s and friends) and it's guaranteed that the new deployment will be pristine, without the risk of leftover files from a rsync or similar - and if I need to roll back I do the exact same.
There’s value in that, but you don’t need Docker with its related debugging and maintenance overhead to get it. NixOS, among other tools, will do the same thing while constructing a “flat” operating system image.

Anything else, though? There’s got to be more to it than that, or it wouldn’t be as popular as it is.

yeah see some of us still do this on OSes that haven't turned into a giant bloated hodgepodge of security theatre and false panacea software.

docker has dead whale on the beach vibes. what value does it offer to those of us who have moved on from the mess linux is becoming?

I’m not suggesting it has value to everyone. I’m suggesting it has value to the people who see value in it.
I'm super curious to know what that value happens to be for the people who see it. It's serving static websites, why do I need to wrap THAT of all things in a container?

Really, enlighten me

> why do I need to wrap THAT of all things in a container?

If you can't see a reason why, then you probably don't need to. You probably have different needs than other people.

Many people use Docker not because of what they're doing inside of the container, but because it is convenient for tangential activities. Like lifecycle management, automation, portability, scheduling, etc.

I have several static sites in Docker containers in production. We also have dozens of other microservices in containers. We could do everything the same way, or we can one-off an entirely separate architecture for our static sites. The former makes more sense for us.

Because you want a reproducible environment/runtime for that static server. Nix/NixOS takes it a step further, in that it provides not only a reproducible runtime environment, but a reproducible dev and build environment as well.
Once you've gone the container route you no longer even need to think about virtual servers. You can just deploy it to a container service, like ECS.
I actually found myself needing something like this a couple weeks ago. I use a self-hosted platform (cloudron.io) that allows for custom apps. I wanted to host a static blog on that server. Some people are happy to accept "bloat" if it does, in fact, make life easier in some way.
If you literally ONLY ever need to run a single static website, then yeah, containers might not be helpful to you.

But once you start wanting to run a significant number of things, or a significant number of instances of a thing, it becomes more helpful to have an all-purpose tool designed to manage images & run instances of them. Having a common operational pattern for all your systems is a nice, overt, clean, common practice everyone can adapt & gain expertise in. Rather than each project or company defining its own deployment/operationalization/management patterns & implementations.

The cost of containers is also essentially near zero (alas somewhat less true with regards to local FS performance, but basically equal for many volume mounts). They come with great features like snapshots & the ability to make images off images- CoW style capabilities, the ability to mix together different volumes- there's some really great operational tools in containers too.

Some people just don't have real needs. For everyone else...

Out of curiosity, what OS have you moved on to?
OpenBSD for the past 10 years or so has been really good to me and my clients, and it just keeps on getting better while linux keeps on getting worse. It's kind of a no-brainer these days.

Hell if you just need to serve static HTTP it even has its own built in webserver now:

https://man.openbsd.org/httpd

In terms of CPU cycles and disk space, maybe. In terms of engineer cycles, absolutely not. Which costs more?
Hmm, a SCP shell script on my laptop, prompting my SSH key's password and deploying the site to the target machine?

Or a constantly-updating behemoth, running as root, installing packages from yet another unauditable repository chain?

You forgot the step where you had to provision that server to run the software and maintain all the systems security updates on the live running server, and that server requires all the same maintenance, with or without docker. And if you fuck it up, better call the wife and cancel Sunday plans because you forgot how it all gets installed and ......yeah, just use docker :p
Debian offers unattended upgrades: https://wiki.debian.org/UnattendedUpgrades

And security updates, as you said, are needed regardless of whether you run Docker on top. I think Docker is a needless complexity and security risk.

Security updates are only needed on the OS level if you're running Docker on bare metal or a VPS. If you're running Docker in a managed container or managed Kubernetes service such as ECS/EKS, you only need to update the Docker image itself, which is as simple as updating your pip/npm/maven/cargo/gem/whatever dependencies.

I see two main places where Docker provides a lot of value: in a large corp where you have massive numbers of developers running diverse services on shared infrastructure, and in a tiny org where you don't have anyone who is responsible for maintaining your infrastructure full time. The former benefits from a standardized deployment unit that works easily with any language/stack. The latter benefits from being able to piggy-back off a cloud provider that handles the physical and OS infrastructure for you.

And you're welcome to think so, but if you intend to make a case for removing Docker as optimization, you still have yet to start.
The first option is something custom that you had to write yourself and remember how to use years later, or explain how to use to others

The second option is standardized and usually the same 1 or 2 commands to run anywhere

Building simpler systems allows you to save on all three.
That's true, but in my experience there is nothing mutually exclusive in systems being simple and systems running Docker.

Granted, you do need to learn how Docker works, and be ready to help others do likewise if you're onboarding folks with little or no prior experience of Docker to a team where Docker is used. That's certainly a tradeoff you face with Docker - just as with literally every other shared tool, platform, codebase, language, or technological application of any kind. The question that wants asking is whether, in exchange for that increased effort of pedagogy, you get something that makes the increased effort worthwhile.

I think in a lot of cases you do, and my experience has borne that out; software in containers isn't materially more difficult to maintain than software outside it if you know what you're doing, and in many cases it's much easier.

I get that not everyone is going to agree with me here, nor do I demand everyone should. But it would be nice if someone wanted to take the time to argue the other side of my claim, rather than merely insisting upon it with no more evident basis than arbitrarily selected first principles given no further consideration in the context of what I continue to hope may develop into a discussion.

Docker absolutely ups the complexity.

Whatever set-up your application needs is still a necessary step in the process. But now you've not only added more software in Docker, with its Docker registry and Docker's state on top of the application's state, you've also introduced multiple virtual filesystems and a layer of mapping between those and locations on the host, plus mappings between the container's ports and the host's ports. There is no longer a single truth about the host system. The application may see one thing and you, the owner, another. If the application says "I wrote it to /foo/bar", you may look in "/foo/bar" and find that /foo doesn't even exist.

All of that is indirection and new ways things can be that did not exist if you just ran your code natively. What is complexity if not additional layers of indirection and the increase of ways things can be?
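A rough Python sketch of the path indirection described above, for anyone who wants to see it spelled out. It is a simplified model, not Docker's actual resolution logic: the function name `host_path`, the `mounts` dictionary, and the example directory `/srv/app-data` are assumptions invented for the illustration.

```python
import posixpath


def host_path(container_path, mounts):
    """Return the host path backing container_path, or None if the path
    lives only in the container's own filesystem layer.

    mounts maps container mount points to host directories (hypothetical).
    """
    path = posixpath.normpath(container_path)
    # Prefer the longest (most specific) matching mount point.
    for mount_point in sorted(mounts, key=len, reverse=True):
        mp = posixpath.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            rel = posixpath.relpath(path, mp)
            return posixpath.normpath(posixpath.join(mounts[mount_point], rel))
    return None


# The application writes to /foo/bar inside the container...
mounts = {"/foo": "/srv/app-data"}
where_on_host = host_path("/foo/bar", mounts)
# ...but a path outside every mount has no host-side counterpart at all.
not_on_host = host_path("/tmp/cache", mounts)
```

Under these assumptions the file the application calls `/foo/bar` sits at `/srv/app-data/bar` on the host, and the host never needs a `/foo` directory of its own.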

Okay, and in exchange for that, I've gained single-command deployments of containers that already include all the dependencies their applications require, and at most I only have to think about that when I'm writing a deployment script or doing an update audit.

It's rare that I need to find out de novo where a given path in a container is mapped on the host. When I do need to do that, I can usually check a deployment script, or failing that inspect the container directly and see what volume mounts it has.

I don't need to worry about finding paths very often - much less frequently than I need to think about deployments, which at absolute minimum is once per project.

So, sure, by using Docker I've introduced a little new complexity, that's true. But you overlook that this choice does not exist in a vacuum, and that that added complexity is more than offset by the reduction of complexity in tasks I face much more often than the one you describe.

And that's just me! These days I have a whole team of engineers on whose behalf, as a tech lead, I share responsibility for maintaining and improving developer experience. Do you think I'd do them more of a favor by demanding they all comprehend a hundred-line sui generis shell script for deployments, or by saying "here's a single command that works in exactly the same way everyone you'll work with in the next ten years does it, and if it breaks there's fifty people here who all know how to help you fix it"?

Does it? Or, rather, is it even simpler?

To host something as a docker container I need 2 things: to know how to host docker, and a docker image. In fact, not even an image, just a dockerfile/docker-compose.yaml in my source code. If I need to host 1000 apps as docker containers, I need 1000 dockerfiles and still to know (and remember) 1 thing: how to host docker. That's 1 piece of knowledge I need to keep in my head, and 1000 I keep on a hard-drive, most of the time not even caring what's the instruction inside of them.

If I need to host 1000 apps without dockerfiles, I need to keep 1000 pieces of knowledge in my head. thttpd here, nginx to java server there, very simple and obvious postgres+redis+elastic+elixir stack for another app… Yeah, sounds fun.

I think the real value is just focusing on the absolute minimum necessary software in a production docker/container image. It's a good practice for security with less surface area for attackers to target.
The difference between a systems engineer and a software engineer is that to a systems engineer a half functioning 5MB docker image is okay but to a software engineer a fully functional 5GB Node image is fine.
Premature optimisation? 5 GB doesn’t matter. It’s not great, don’t get me wrong.