Hacker News
by zdw 871 days ago
Are "Random shell scripts from the internet" categorically worse than "random docker images from the internet"?

With the shell script, you can literally read it in an editor to make sure it isn't doing anything that weird. A single pass through shellcheck would likely tell you if it's doing anything that is too weird/wrong in terms of structure.

Auditing a docker container is way more difficult/complex.

"Dockerize all the things", especially in cases when the prereqs aren't too weird, seems like it wastes space, and also is harder to maintain - if any of the included components has a security patch, it's rebuild the container time...

9 comments

If you want an example of how little importance most ops/infra teams place on vetting OCI images, I have a great one. I used to work on low-level k8s multitenant networking stuff, think CDNs. Most of them use something like Multus to split up VFIO paths between tenants. Think chopping your NIC into 24 private channels, where each channel is one customer. The ENTIRE path has to be private: the container starts and claims that network path on the physical NIC. No network packet can ever be accessed by another channel, server or container. I was alpha-testing Multus, which controls the network path that every customer's ingress and egress traffic takes through a cluster, and I put up some test containers on Docker Hub.

Multus sits at the demarc line between the container and the NIC channel. I'm not saying it's possible or has ever been done, but if I were going to set up a traffic mirror somewhere, it'd logically have to be there or after the NIC.

I wrote it 5 years ago. I have no idea what version of multus it's running, but even today it's getting pulls, last pull 19 days ago. Total pulls over 5 years are over 10k.

These containers would spin up every time a container starts on k8s that attaches an ovf interface. So, it's pretty much guaranteed that this is in use somewhere in someone's scaling infra. I don't know if I SHOULD delete the image and potentially take down someone's infra or just let them keep chugging at it. I'm not paying for dockerhub.

https://hub.docker.com/repository/docker/swozey/multus/gener...

edit: Looks like it's installing the latest multus package, so not AS terrible, but... multus is not something to play fast and loose with on versioning.

Also I really wish Dockerhub gave you more stats/analytics. It really means nothing in the end but I'm curious. They don't even tell you the number beyond 10k, it just says 10k+ downloads.

https://github.com/k8snetworkplumbingwg/multus-cni

Something like this would show up in perimeter network/firewall logs, correct? But if someone was mirroring traffic to the same cloud provider you deploy in, it would be less obvious to find out _which_ set of cloud IPs aren't actually your own. That's assuming you have both perimeter logs and a system which notifies a human if something is weird in the logs.

Do big clouds have a solution for this? I don't usually use GCP / AWS so I don't know what they have

> Auditing a docker container is way more difficult/complex.

I assume you mean auditing docker images. In which case, sure. That's why you grab their dockerfile and build it yourself.

Though using dive[1] it's pretty easy to inspect docker images too, as long as they extend a base image you trust.

[1] https://github.com/wagoodman/dive

> That's why you grab their dockerfile and build it yourself.

Then you still didn't audit anything. What you need to do is inspect the docker file, follow everything it pulls in and audit that, finally audit the script itself that the whole container gets built for in the first place. Whereas when you just download the script and run that directly, you only need to do the last step.

Yeah, people don't seem to actually care. The Bitnami images were quite popular, but looking inside them, they all just pull random tarballs from their server, and nothing seemed to indicate where those things came from.
All of that is the same as a shell script, yes. A dockerfile is essentially just a glorified shell script installing dependencies, which you'd otherwise just be doing yourself.
>Then you still didn't audit anything. What you need to do is inspect the docker file, follow everything it pulls in and audit that

You don't need to audit anything it pulls in INSIDE the container. Who cares? Just what kind of access it gives the container to the host.

This sounds like a fine way to mine Bitcoin for someone else
The whole point is that you checked that the container gets no access to the network.

Not to mention why wouldn't you let a shell script container keep running?

You can use quotas to mitigate that risk, and monitoring to discover it. You'd be monitoring CPU usage anyway, whether or not you build your own images or write your own Dockerfiles.
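A rough sketch of what "no network plus quotas" could look like, written as a small Python helper that only assembles the `docker run` argument list and does not execute anything. The flags `--network none`, `--cpus`, `--memory`, `--pids-limit`, `--cap-drop` and `--read-only` are real `docker run` options; the helper name, the default limits and any image name you pass in are made up for illustration.

```python
def locked_down_run(image, command, cpus="1.0", memory="512m", pids=128):
    """Build a `docker run` argv with no network and hard resource limits.

    The function name and the default limits are illustrative assumptions,
    not values recommended anywhere in this thread.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",        # container gets no network access
        "--cpus", str(cpus),        # CPU quota
        "--memory", str(memory),    # memory limit
        "--pids-limit", str(pids),  # cap on the number of processes
        "--cap-drop", "ALL",        # drop all Linux capabilities
        "--read-only",              # read-only root filesystem
        image, *command,
    ]
```

Passing the returned list to `subprocess.run` would start the container. Nothing here replaces monitoring; it only bounds how much CPU and memory a misbehaving image can take.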
oh dang, dive is really a nice tool, per layer diff and/or accumulated changes .. really nice
A script running in a container is mostly isolated from the host by default, so it can't just upload whatever SSH keys / Bitcoin wallets / other stuff you have lying around or add some payload to your ~/.bashrc unless you explicitly share those files with the container.
This is true, but we are talking about running this script on some codebase (or whatever you want to "git undo"). I mean "I don't trust this script, but let's run it on our source code" sounds a bit weird.
I agree, in this case it's hard to defend against a rogue script or container image, as you need to give it read-write access to your source code, so it could add a malicious payload to your source code or install a Git hook to break out of the container into your host or get some malicious source code onto your company's Git server.

There are measures that could defend against this (run all your development tools inside containers, and mandatory PRs with reviews) but they are probably beyond what many/most developers are willing to do security-wise.

There are a lot of scenarios where I think security through isolation/containerization makes a lot of sense (e.g. for code analysis tools, end-user applications like video games, browsers, etc.) but not too much for this particular one.

I’d be quite surprised to see a company not using code reviews. Nowadays I work with a pretty large CI/CD pipeline, but even when we were a small company we enforced code reviews on all changes.

I’ve seen people be a lot looser with code execution though.

Run a diff after running the script and it should bring up anything funny. Hopefully people won't be just running it and automatically committing and pushing without inspecting the results, right?
It could easily upload your source code, add a git hook that will run (out of the container) next time you commit, create .env or similar files that are git-ignored but automatically run by common tools, etc.
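One way to make the "run a diff afterwards" check also cover git hooks and git-ignored files is to hash every file in the working directory before and after, instead of relying on `git diff`. A minimal sketch, assuming the tree is small enough to hash in full; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map every regular file under root (including .git and ignored
    files) to the SHA-256 of its contents."""
    root = Path(root)
    result = {}
    for path in root.rglob("*"):
        # Skip symlinks so we never hash something outside the tree.
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def changes(before, after):
    """Compare two snapshots and list added, removed and modified paths."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            p for p in set(before) & set(after) if before[p] != after[p]
        ),
    }
```

Take `snapshot(".")` before running the tool, take it again afterwards, and feed both to `changes`. This catches a new `.git/hooks/pre-commit` or `.env`, but it does nothing about the upload case and does not notice permission changes.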
Yes, I understand https://xkcd.com/1200/ as well.

Running anything without understanding what it does is more dangerous than trying to understand it before running it.

I'm arguing for less complexity and easier auditing, instead of a series of complex layers that each add to a security story, but make the overall result much harder to audit.

To move directionally in the way you describe, you probably have to make the user experience of running scripts of any kind much weirder. macOS does this to some extent by prompting via GUI if something tries to access data directories on your system (though it confuses iTerm2 for "anything iTerm2 runs" and that sucks), but I think people would have a lot more problems with trying to do that in a server shell.

To that end, Linux namespacing is probably a better way to constrain the blast radius for most people. That's not to say it should be an either-or, but in the absence of a both-and because the userland is not set up for sufficient policing, I think Docker containers are a pretty clearly better solution.

The dive utility helps tremendously for exploring the filesystem contents of a container image. Combine that with the output of `docker inspect` to look at the metadata and you should be able to have a good understanding of what it will do when running as a container.
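As a sketch of the metadata half of that, the helper below pulls out the fields of `docker inspect` output that decide what runs by default and as which user. It assumes the usual shape of that output for an image (a JSON array whose first element has a `Config` object); the function name and the returned keys are made up for illustration.

```python
import json


def summarize_inspect(inspect_json):
    """Summarize the parts of `docker inspect` output that describe what
    an image does when started with no extra arguments."""
    data = json.loads(inspect_json)
    config = data[0].get("Config") or {}
    return {
        "entrypoint": config.get("Entrypoint") or [],
        "cmd": config.get("Cmd") or [],
        # An empty User field means the container runs as root by default.
        "user": config.get("User") or "root (default)",
        "exposed_ports": sorted((config.get("ExposedPorts") or {}).keys()),
        "volumes": sorted((config.get("Volumes") or {}).keys()),
        # Variable names only, so values are not echoed around.
        "env_names": [e.split("=", 1)[0] for e in config.get("Env") or []],
    }
```

Feeding it the text printed by `docker image inspect <image>` gives a short summary to read alongside what dive shows for the filesystem layers.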
Evaluating the whole contents of a filesystem is significantly more complex than evaluating one shell script.
>Auditing a docker container is way more difficult/complex.

As long as it doesn't have access to outside of the container, who cares?

You check the dockerfile, see what access it allows, and build the container.

Besides, a shell script can be hundreds of lines, which is not much fun to audit.

We could probably create a java applet or flash application that runs in the browser safely too!

That was more snark than HN likes, but it feels like forgetting promises of the past in a dangerous way.

We do it every day with Javascript in the browser, at, like, 10 orders of magnitude higher frequency than we ever ran Java applets and Flash. The whole web commerce, banking, b2b, etc. depend on it. Imagine that, huh?

Is that enough snark?

Not to mention, if your problem is containers breaking out, you have way, way bigger problems than shell-script containers.

I think you nicely summed up why we have a huge problem with the current state of things, both in the NPM and Docker ecosystem.
One of these things is not like the other.

Also missed that my whole point was about CHECKING the container dockerfile - not running an off the net image as is.

Perhaps this lack of attention to context is why we have a huge problem with the current state of things, both in the NPM and Docker ecosystem.

Javascript has a much smaller surface area to the system than a docker app. And we still find rowhammer-type attacks.
> As long as it doesn't have access to outside of the container, who cares?

https://snyk.io/blog/cve-2024-21626-runc-process-cwd-contain...

Also I can't imagine a real world scenario in which we can safely ignore what happens inside the container. Really reminds me of https://xkcd.com/1200/.
Everyone running containers, particularly untrusted ones, should care because containers aren't a security tool and don't provide secure isolation.
> Are "Random shell scripts from the internet" categorically worse than "random docker images from the internet"?

> With the shell script, you can literally read it in an ...

... https://shellcheck.net.

Can't do that if all of the work is hidden in a Dockerfile's RUN statement. I commit shell scripts in shell script files, and the Dockerfile just runs that shell script. Then the shell script can be version controlled for purpose, static analyzed, and parsed in an IDE with a plugin supporting a shell script language server.

> Are "Random shell scripts from the internet" categorically worse than "random docker images from the internet"?

Yes, because inspection aside, at least with a docker invocation you can specify the volumes

https://github.com/containers/bubblewrap allows specifying volumes for scripts too
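For a sense of what that might look like, here is a hedged sketch that assembles a `bwrap` command line for running a script with the host filesystem read-only, the home directory hidden, one directory writable, and no network. `--ro-bind`, `--tmpfs`, `--bind`, `--dev`, `--proc`, `--unshare-all`, `--die-with-parent` and `--chdir` are bubblewrap options; the helper name and the particular layout are assumptions, and it only builds the argument list without running it.

```python
def bwrap_argv(workdir, script, home):
    """Build a bubblewrap argv that runs `sh script` with only workdir
    writable. Later mount options are layered on top of earlier ones."""
    return [
        "bwrap",
        "--ro-bind", "/", "/",        # host filesystem, read-only
        "--tmpfs", home,              # hide the real home directory
        "--bind", workdir, workdir,   # the one writable directory
        "--dev", "/dev",              # minimal /dev
        "--proc", "/proc",            # fresh /proc
        "--unshare-all",              # new namespaces, including network
        "--die-with-parent",          # no orphaned sandbox processes
        "--chdir", workdir,
        "sh", script,
    ]
```

The script path has to be visible inside the sandbox, so in practice it would live under `workdir`.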
Docker is just a glorified cgroup plus wrappers. You can isolate any process like that, even a shell script.

Chrooting the unknown script gets you 90% of the way there.

Does anyone in practical invocation specify the volumes?

Or would they wrap it in yet another shell script that calls docker with a set of options, or a compose file, etc?

This quickly turns into complexity stacked on complexity...

Yes I run:

sudo docker run -it -v (pwd):(pwd) my_dev_image

many times every day, to create a development environment in CWD. My_dev_image is a debian-based image with common developer utilities (pip, npm, common packages installed). I don't feel comfortable installing random packages from the internet on my host machine, so I use docker for everything.

> Does anyone in practical invocation specify the volumes?

First: yes, I have run docker with -v recently.

Second:

> Or would they wrap it in yet another shell script that calls docker with a set of options, or a compose file, etc?

> This quickly turns into complexity stacked on complexity...

I agree that it can get out of hand, but a Dockerfile, a compose file, and whatever is going inside the container can be an entirely reasonable set of files to have so long as you stick with that and are reasonable about what goes in each. Or, to put it differently, I think it's okay because they actually are a separation of concerns.

I never use containers from the web unless they're created by the company or developer themselves. If they don't produce one then I build my own.
Reading the Dockerfile should tell you what was done to create the image. If you have trust issues around the "base" images such as Debian or Fedora that is a different set of inquiries.

As for patching, you can tell your Dockerfile to always pull the latest versions of the items you are most concerned about. At that point rebuilding the container is as simple as deleting it with "docker container stop <id> && docker container rm <id>" and then running your docker-compose command again.

Does anyone read/diff the build commands every time they get a new `latest` docker image?

There would already be implicit trust in whatever the local OS's package manager laid down, and trying to add another set of hard to audit binaries on top is not really an improvement.