by chmod775 2025 days ago
Call me old fashioned, but I ship both my private projects and those at work as debian packages.

Debian packages are trivial to put into a container, and we tried that, but honestly it's not half as nice to work with.

With containers you have to do a ton of extra steps to get functionality and debugging on a level a default debian system provides you.

Additionally the tools to automate the installation and configuration of debian systems are way more mature compared to docker et al.

Containers aren't quite there yet.

I'm really considering doing this for a new spin-up myself... The stability of Debian tooling for large-scale, hyper-scalable solutions is outstanding. I just feel like the world balks at me every time I do something considered slightly "old fashioned", completely ignoring the finished product to shame me for not using buzz-word tooling.

Containerization is fantastic don't get me wrong, but I've had more success with old-school approaches to package management, deployment, optimization, debugging, etc. running thin Debian servers. Just... prod ops is easier and more stable at the end of the day. I really don't see the need to containerize everything outside of cross-platform development tooling. I also really prefer having a semblance of an OS/bash terminal when it comes to ops!

Also: this is purely anecdotal. And, to get ahead of the folks yelling "you just don't understand Docker and K8s" - yes I do. I still think they're great, I just am not fully sold on them for every use-case.

Debian plus an automation system like Puppet, Chef, etc works extremely well.

It's just not sexy enough for people to write hundreds of posts about how to set up your own package repo, understand unattended-upgrades, and do monitoring.

> Debian plus an automation system like Puppet, Chef, etc works extremely well.

Eh, it does until it doesn't. Sooner or later you run into pitfalls around the leaky abstraction of pretending your state is truly idempotent and path-independent. E.g. spinning up a new instance works fine, but the existing instances that need to uninstall a previous version to upgrade to the newer one end up breaking. Or vice-versa, existing servers work fine but then when you need to launch a new one you realize the config no longer works on a clean install and you hadn't noticed it for weeks.
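
A minimal sketch of that path-dependence, in Python. The package names and the `plan` helper are hypothetical and are not taken from Puppet or Chef. The point is only that the steps needed to reach the same desired state differ between a clean host and one still carrying an older version, so both paths need testing.

```python
def plan(current: dict[str, str], desired: dict[str, str]) -> list[str]:
    """Return the ordered actions that move `current` to `desired`.

    Both arguments map a package name to its installed version.
    """
    actions = []
    # Remove anything installed that the desired state does not mention (cruft).
    for name in sorted(current.keys() - desired.keys()):
        actions.append(f"remove {name}")
    # Install or upgrade anything missing or at the wrong version.
    for name in sorted(desired):
        have = current.get(name)
        want = desired[name]
        if have is None:
            actions.append(f"install {name}={want}")
        elif have != want:
            actions.append(f"upgrade {name} {have}->{want}")
    return actions


# Hypothetical hosts: one fresh, one that has accumulated state.
desired = {"myapp": "2.0", "libfoo": "1.1"}
clean_host = {}
old_host = {"myapp": "1.0", "libfoo": "1.1", "libbar": "0.9"}

print(plan(clean_host, desired))  # install-only path
print(plan(old_host, desired))    # remove + upgrade path
```

The two hosts end in the same state but exercise different code paths, which is where the breakage described above tends to hide.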

Container-based systems certainly have their own problems, but it is really nice having a model where you don't allow long-lived implicit state and cruft to accumulate on your application servers in the same way.

If I don't understand what I'm doing, it doesn't matter what tools I'm using to fail to accomplish it.
Ah, the myth of the perfect programmer.

Everyone makes mistakes, and state resets such as VMs or containers are one of the easiest ways to revert these mistakes.

Everyone makes mistakes, and having real testing infrastructure is the best way to catch it.

Many mistakes are made against databases; maybe you can roll back, maybe you can't -- either way, it's cheaper in testing than production.

Except when your Puppet/chef whatever aren’t set up carefully to be self-contained to be able to bring up the system to the correct state regardless of whatever interim state it might be in. It’s not impossible but it’s also not an out-of-the-box thing, it’s complicated to pull off, and usually requires constant expertise rather than being a playbook you hand off to whomever (even with expertise you can have bugs, you just aren’t in as big of a hole because you’ve ratholed on the assumption that you always have version 0 or version N-1 to get to state N).

Doing this is hard and there’s nothing in the ecosystem I’m aware of to guarantee that. Nix maybe? That’s part of the problem. The other is that docker has a conveniently large set of configs and things “out of the box” so there’s familiarity and documentation.

It’s not impossible to accomplish but building that community of doing things a sustainable way that actually addresses the pain points (rather than just dismissing it with “you’re just using the old tools incorrectly”) is what you need. If you’re going after docker your solution will have to support devs and devops. If you’re going for a niche community of enthusiasts/experts (probably more defensible and easier to grow), then focus just on a single niche use-case that general solutions could never outmatch (but don’t ignore growing it carefully if your niche solution is meaningfully better - listen to your users that you trust).

Re: Containers aren't quite there yet.

I think they've been there for 12-14 months. It's no longer a question of "If" but "When" a company decides on its container strategy (and it's more than just k8s - see https://blog.coinbase.com/container-technologies-at-coinbase...)

I work for a company that is 100% k8s. Base linux of the containers is Debian 8- but honestly doesn't really matter that much - the OS is more kubectl and the orchestration around k8s (GKE, Prometheus, Sysdig, Grafana, ELK) - the "operating system" has moved up a stack.

When I was working in a stack like this I found people spending outstanding amounts of time not actually working to improve the stability/performance of the application. The reason you triggered that memory was "GKE, Prometheus, Sysdig, Grafana, ELK" - that's exactly what we were dealing with. The support infrastructure/compute needs for it far exceeded the 20-30 hosts that actually needed to be there to operate the application.

We either were using someone else's prebuilt orchestration for something like ELK (insecure, needs constant auditing to be OK) or rolling it ourselves (very expensive in engineer time). None of it was ever working 100% and that was because we were jumping at software packages no one had really taken the time to fully understand. The mentality was "it's containerized!" which many on my team took to mean "we don't need to really grok it, it's in a container!" That burnt us, both on our TIG and ELK stacks. I left that job because it became putting out dumb fires that were not business-justifiable.

All-in-all I'm not saying what anyone is doing is wrong, I'm just saying that if you're going for an orchestrated environment like this you have to have a very mature team. You have to really care about learning these services well, and you have to be careful to not let your own architecture take your time away from solving real problems for the business.

The team I was on did not have that maturity outside of a couple bitter/broken ops guys who didn't deserve what the team had done to them while buzz-word driven leadership gutted their very-proven and stable VMWare infra into a total cluster-f K8s setup because "that's what we're supposed to do in 2018! That's what the new engineers want to work in!"

> the "operating system" has moved up a stack

Splitting hairs: The OS is still the same. The "stack" is newly imposed abstraction on-top of already established paradigms where we are trying to abstract ourselves away from the OS. It's distributed compute more than it is the "OS moving up a stack".

Edit: Ha, I think you may have edited your comment with the Coinbase article. That article is actually what I point people to when explaining that K8s isn't some silver bullet. I personally think Coinbase is a great compromise in leveraging containers without going off of the rails (as they write about, ex: talking about the need for dedicated "compute" teams etc).

"outside of a couple bitter/broken ops guys who didn't deserve what the team had done to them"

Hey, that's me.

Well, if it's any consolation, programmers aren't safe either. In fact, we are highly paid obnoxious people (to execs) and just imagine being able to replace one of us with a box that works 24 hours, 7 days a week for the cost of the hardware, electricity, and network. How exciting! If and when that happens, I suppose I can find work as a (bad) carpenter or something.
I mostly agree with all of what you've said here. In our case, it's not unusual for a single customer environment to surge to 200-300 instances of an underlying compute server, and then scale back down to 20-30 at steady state. With 30 customer environments, you might have customers running from anywhere as low as 15 containers to as many as 500+, with a lot of dynamic flux depending on data ingestion and ETL.

K8S is in flux, so you still have to have a few top-end SRE types to manage your kube environment - the acceleration / maturity of the ecosystem is incredible though, so, sometime in the next 3-4 years, we'll start to see things get standardized enough that the wizardry required to keep it running will become a more commodity skill set.

And, more importantly, most of the ecosystem is fairly identical between azure/google/AWS - so porting or going multi-cloud is usually a week's effort if that's something you want to do.

By "Moving up the Stack" - Of course I understand that cgroups/linux underpins it all - it's just that we're not using linux system binaries to manage the containers directly.

I mean that things like process, storage, memory, and CPU utilization aren't something we tweak/query with OS commands; rather, we're sending request/limit configurations to kube and letting it worry about managing the resources, relying on PromQL to monitor resource utilization, etc...
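
For readers who haven't seen a request/limit configuration, here is a rough sketch in Python of the `resources` stanza of a Kubernetes container spec. The container name, image, and the specific values are made up for illustration; only the `requests`/`limits` structure with `cpu` and `memory` keys reflects the Kubernetes container spec.

```python
import json


def container_resources(cpu_request, mem_request, cpu_limit, mem_limit):
    """Build the `resources` stanza of a Kubernetes container spec."""
    return {
        # What the scheduler reserves for the container.
        "requests": {"cpu": cpu_request, "memory": mem_request},
        # The ceiling the container is allowed to reach.
        "limits": {"cpu": cpu_limit, "memory": mem_limit},
    }


container = {
    "name": "etl-worker",        # hypothetical name
    "image": "example/etl:1.0",  # hypothetical image
    "resources": container_resources("250m", "256Mi", "1", "512Mi"),
}

print(json.dumps(container, indent=2))
```

The cluster enforces these values, which is the sense in which resource management moves from OS commands to declarative configuration.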

> we'll start to see things get standardized enough that the wizardry required to keep it running will become a more commodity skill set

I am so ready for this!

Constant scale-up/scale-down and dynamic load is what I jump at K8s for personally. Totally see the use-case for what you're talking about.

All-in-all I love K8s and containers, use them myself, and have been really happy with the results. It's just when I've worked with it professionally I don't find my colleagues typically have the skillset (not the fault of the tech).

GKE is the container/kubernetes engine, but then you also mention Prometheus, Sysdig, Grafana, ELK.

Sounds like much of the problem was the monitoring stack, just curious why you blame that on containers and k8s? Wouldn't you still have needed a solution for that for 20-30 hosts regardless of how you're orchestrating/running the applications?

> just curious why you blame that on containers and k8s?

Crawl, walk, run sorta stuff. We had never just gotten the application/monitoring/everything humming on pure Linux hosts; we skipped that entirely because "K8s and containers!" When you haven't properly QA'd, vetted, whatever, your stack, throwing heavy abstraction at it (containers/K8s) is an anti-pattern.

Most companies don't have the resources to run a competent K8s distributed compute infrastructure and as a hiring manager (as much as an IC) I know I have to hire very specific, very expensive people for that role. Good ops folks come with experience in their realm and the newer the tech stack the harder it is to find competent help due to talent market conditions.

I don't blame containers and K8s - I blame the people, and I blame companies/teams for jumping at new tech that often doesn't have a justifiable use-case outside of "we're doing the popular thing!" vs. really considering what the needs of the solution are.

I also have a very low tolerance for downtime, and with those huge abstractions I find stuff gets missed more often, leading to my application being down for my users. I am a KISS engineer.

Fair enough, I definitely can relate to that line of problems.

"Should we spend the time to do a thorough look at our monitoring needs, figure out where the gaps are, be more disciplined about using the tool consistently, etc.?"

"That'll take too long, I heard about this shiny new ops tool that claims to require zero configuration, let's just drop this in instead!"

"Containers aren't quite there yet."

The rest of the world begs to differ. It's not a question of whether it's ready or not, it's "Am I going to use it or not".

We're way past that question.

IMHO we are in a phase of enthusiasm, but we can already anticipate the peak and the trough of disillusionment will come.

In 10 years, the pendulum will have swung a bit back and forth and we’ll know better what works well. I bet it’s some form of lambda architecture.

Let me say that I am not pro or against docker per se. I just happen to have started my career with a strong team pre-docker and a lot of the docker-enthusiasm isn’t all that much addressing what was lacking in the operations space pre-docker.

> I bet it’s some form of lambda architecture.

Well: https://aws.amazon.com/blogs/aws/new-for-aws-lambda-containe...

In 2000, I had HP-UX Virtual Vault (aka containers) and CGIs (aka lambda functions).

Everything old is new again.

I guess call me new fashioned, but I've never really understood how to use debian packages well. I recall vaguely looking into the dpkg and build commands many years ago, it felt kind of inscrutable and clunky and I didn't find good resources that made it easy to learn so I just gave up on it and kept using the shell script to install the thing I needed with its dependencies.

By contrast, docker build and docker run are super simple to get started with (at least at a high level, figuring out the right order of flags and options to mount volumes and expose ports can get a little cumbersome). And the docker registry is super simple to browse. It's so easy to get up and running with, and the model it promises of isolation and self-contained dependencies makes a lot of sense which I think is why it has taken off so much. Despite the fact, which I think is what you're pointing out, that there often end up being a lot of pitfalls lurking just around the corner.

I think this too. Debian packages I'm sure are powerful but I never sat down and RTFM yet, which I think is a solid requirement. Docker is easier to understand and I picked it up pretty easily just running through a quick tutorial.

Definitely pros/cons to both depending on your situation. I can imagine debian packages being more useful in large scale multi-developer environments.

A deb file is just an ar archive of 2 tgzs, one with metadata and one with the actual files.

Now, the tools that build them are a whole different kettle of fish.

Plus some stuff is really old school: ar is an archive format that nobody has used on its own since 1996. It's a sort of transparent bundler/archiver à la tar; I think it was originally used to bundle object files into static libraries (.a) while keeping the symbols inside visible. I don't really remember all the details, it's been a while since I looked into .deb packages.
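
To make the "ar archive of tarballs" point concrete, here is a small Python sketch that lists the members of an ar archive held in memory. It assumes the common ar layout (an 8-byte magic string followed by 60-byte member headers) and does not handle GNU long-name tables. Real .deb files also carry a small `debian-binary` version member ahead of the control and data tarballs, and the tarballs are not always gzip-compressed. The archive built at the bottom is a fake with placeholder payloads, not a real package.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def list_ar_members(data: bytes) -> list[tuple[str, int]]:
    """Return (name, size) for each member of an ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        # Bytes 0-15 hold the name, bytes 48-57 the decimal size.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded so the next header starts on an even offset.
        pos += HEADER_LEN + size + (size % 2)
    return members


def fake_member(name: str, payload: bytes) -> bytes:
    """Build one ar member (demo helper, not part of any Debian tool)."""
    header = (
        f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(payload):<10}"
    ).encode("ascii") + b"`\n"
    padding = b"\n" if len(payload) % 2 else b""
    return header + payload + padding


fake_deb = (
    AR_MAGIC
    + fake_member("debian-binary", b"2.0\n")
    + fake_member("control.tar.gz", b"ctl")      # placeholder bytes
    + fake_member("data.tar.gz", b"payload")     # placeholder bytes
)

for name, size in list_ar_members(fake_deb):
    print(name, size)
```

On a real package, the same function could be fed the bytes read from the .deb file.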

I think the main problem is that Debian is not a commercial project and it shows sometimes. The tooling is kind of "hidden" (you have to poke around the distribution, mailing lists, etc. to figure things out) and the processes are kind of the same thing. The docs on the site are ok in some regards but they're far from complete and up-to-date.

Meanwhile the Dockerfile format is reasonably well documented and the tooling is also quite straightforward. You can see that, for a while, a company made it its raison d'être and wanted to make it easy to use.

I'm totally the opposite (old fashioned?), if I want to play with some library I just 'fire up' an rpm so I don't have untracked files spread around /usr/lib.

Having a multi-megabyte docker container to run some random program just seems...wasteful.

It's a mistake to assume that just because you're using docker, you now shouldn't bother understanding how (debian's) packaging works.

You're usually still installing packages and working with a debian/ubuntu/whatever distribution, except they're in a container now.

Docker is an additional abstraction layer one should understand, not a replacement for an existing one.

If you are used to dealing with debian packaging and running stuff directly on your server, using docker can feel like taking a step forward, but also two backwards. A few things are better, but a bunch are worse.

How do you handle multiple versions of the same project/software/deployment on the same machine?
You can for example have many SQL databases on the same db server, or many web sites on a www server. So you solve it with configuration. If you need different kernels you have to use virtual servers anyway.
The simple answer is you can choose not to. In many ways, a VM is a better abstraction than a container due to the simplicity of virtualising the hardware interface, as opposed to creating another abstraction layer in the kernel dealing with process isolation, permissions and system controls.
On the other hand, VMs are wasteful resource-wise (and $$$-wise) and have a much larger operational overhead (suddenly for every deployment you have a different Linux installation, with its own root /, with its own configuration drift, which you have to manage separately via CM).
To be fair, a container often ends up being its own Linux installation with its own configuration drift. So many dockerfiles mindlessly pull in an entire Ubuntu system just to run a simple app.
But the image [1], once built, is still idempotent. You can deploy it and it will always contain the same configuration and code.

Meanwhile, a month-old Ubuntu VM that has received regular CM pushes (including system updates) will likely vastly differ from a brand new Ubuntu VM and a single CM push. To the point, where you can't be sure anymore that your current CM config will even work on a brand new machine, unless you're regularly testing that.

[1] - Yes, Dockerfiles do not make for reproducible builds - but once an OCI image is built, its deployment is going to be reproducible. And there's more ways to build images than via Dockerfiles - some of which solve this problem (using Nix or Bazel, for example).

> But the image [1], once built, is still idempotent. You can deploy it and it will always contain the same configuration and code.

VMs can be idempotent too. It's just that traditionally people attach storage to it. But VM snapshots are a thing.

> To the point, where you can't be sure anymore that your current CM config will even work on a brand new machine, unless you're regularly testing that.

The same can be argued about attached storage to a container.

By idempotent do you mean immutable?
Those same issues exist with docker containers. You can also build a pipeline to deploy very barebones VMs that contain the kernel, a barebones userland and the application. Use KSM to minimise memory usage. What you get with containers is a shared page cache and reduced context switching.

Once upon a time in tech, the thinking was hardware is cheap, technical staff is expensive, hence we moved on to systems and programming languages that saved us time at the expense of efficiency on the hardware.

20 years on, the cost equation hasn't changed. In fact, it's probably shifted drastically towards the extremes. We'd likely save more energy by eliminating crypto mining than moving all VMs onto containers.

On my development laptop: a mix of VMs, docker containers, language/package managers. It's a per project choice, either mandated by my customers or advised by me. To name a few technologies I'm using right now in three different projects open in different virtual desktops:

docker

vagrant

VirtualBox (even some scripts to mimic EC2's spawning of machines with VBoxManage)

asdf (really, my fingers didn't slip on the keyboard)

npm

rvm

python's virtualenvs

I never had the need to do that and I'm not sure in what kind of situation I would.

For updates I just install the new version of the software, then perform a restart (new version starts, once it's ready, old version stops).

Each version of the project gets a directory that contains everything the project needs. Put all of those into a parent directory. Use a symlink to point to the currently active version. E.g.:

    /usr/local/thing/versions/thing-v1.3.7
    /usr/local/thing/versions/thing-v1.4.2
    /usr/local/thing/current -> /usr/local/thing/versions/thing-v1.3.7
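
One way the symlink switch above could be scripted, as a hedged Python sketch. The `activate` helper is hypothetical, and the demo uses a temporary directory rather than /usr/local/thing. It assumes a POSIX system, where renaming a new symlink over the old one replaces it in a single step, so readers of `current` never see a missing link.

```python
import os
import tempfile


def activate(root: str, version: str) -> None:
    """Point <root>/current at <root>/versions/<version>."""
    target = os.path.join(root, "versions", version)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    current = os.path.join(root, "current")
    tmp_link = current + ".tmp"
    # Clean up a leftover temporary link from an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Create the new link under a temporary name, then rename it over
    # the old link; the rename acts on the link itself, not its target.
    os.symlink(target, tmp_link)
    os.replace(tmp_link, current)


# Demo in a scratch directory.
root = tempfile.mkdtemp()
for v in ("thing-v1.3.7", "thing-v1.4.2"):
    os.makedirs(os.path.join(root, "versions", v))

activate(root, "thing-v1.3.7")
activate(root, "thing-v1.4.2")  # roll forward
activate(root, "thing-v1.3.7")  # roll back
print(os.readlink(os.path.join(root, "current")))
```

Rolling back is the same operation as rolling forward, just with an older version name.
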
I mean, sure, I've done this too with a handful of scripts. But is this something you do via .debs? I'm asking specifically about how to handle this with plain Debian/Ubuntu packaging.
I do it similarly. I run multiple versions of a service at once with systemd service files. It gives me the same stuff as containers - cgroups, isolation, logging, service definitions and automation with ansible, but it's easier on my feeble psyche.
Why would you want to do that?
For development, for example. Or for some kinds of shared hosting.
Isn't that a sign of technical debt? Not OP, but for development/testing: in a VM.
How is it technical debt? How else do you handle software rollout and rollback, or canarying? Do you have a VM for every single version of your software?
Uhhh... backups?! Not all companies release daily or weekly, and some not even monthly. Stage the rollout, do your testing, get your evidence, get your plan, perform the release.
So if you deploy a new release which turns out to be buggy, your only recourse is doing a full backup restore?
No there are snapshots for that.
q3k, I can't reply to you at this depth. But yes. You're saying a "full backup / restore" but it's not the entire system.

Let's say you have an app, in a folder, that reads config files from 3 other locations on the machine. It talks to two databases. You back up two databases and 4 total folders. That's your backup. It's simple and straight forward to me.

(you have to wait until you can reply after a certain depth - this is HN's anti-flamewar system kicking in)

I understand you can restore from backups, but this doesn't seem simple to me - especially when you deal with situations where there's more than just one person deploying to production.

In comparison, my rollbacks are performed the same way rollouts/rollforwards are - by editing a single line in Git (ie. changing the OCI image string) and running `kubecfg update`. No need to access backups, no need for special procedures.
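
As a rough illustration of what "editing a single line" amounts to, here is a Python sketch that swaps the image string in a manifest. The manifest fragment, registry name, and `set_image` helper are all hypothetical; the commenter's actual repository layout and kubecfg configuration are not shown in the thread.

```python
import re


def set_image(manifest: str, new_image: str) -> str:
    """Return `manifest` with its first `image:` value replaced."""
    updated, count = re.subn(
        r"^([ \t]*(?:- )?image:[ \t]*).*$",
        lambda m: m.group(1) + new_image,  # keep indentation and key as-is
        manifest,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no image line found")
    return updated


# Hypothetical manifest fragment checked into Git.
manifest = """\
containers:
  - name: web
    image: registry.example.com/web:1.4.2
"""

rolled_back = set_image(manifest, "registry.example.com/web:1.3.7")
print(rolled_back)
```

Committing the changed line and re-applying the configuration is then the whole rollback, the same motion as a rollout.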

I think it’s flawed to think that you can safely have multiple versions of the same app running (or even installed) simultaneously in the same “universe.” Whether you use jails, VMs, containers, or whatever you should not count on “I didn’t change anything between these two versions that would corrupt the other instance” to help you.
What? Deploying new software is technical debt?
In a real sense, yes. Once it's deployed, it incurs costs, like a debt.

Reminds me of:

> My point today is that, if we wish to count lines of code, we should not regard them as “lines produced” but as “lines spent”: the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger.

-- Edsger W. Dijkstra

I make heavy use of both package managers and containers at my job; they solve different problems. Just because you don't have the problem that containers solve doesn't mean they aren't there yet.