by sveiss 1432 days ago
It looks like the approach this takes is to use instrumentation to record which files in the container are used at runtime when run under a test harness, and then build a new image omitting any files/packages that weren't used during the instrumented run.

I think there are several serious problems with this approach.

First, I would be very wary about trusting an image modified like this in production: it would be very hard to be certain I've exercised every code path I care about--including rarely hit error paths--when running the instrumented build. Perhaps a localization file with error messages is only loaded when an error condition is hit, and removing that file converts a non-fatal logged error into a fatal file-not-found?
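To make that failure mode concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `handle_request` function, the `errors.en.json` catalog, the message key); it only illustrates how a file that is opened solely on an error path looks "unused" to an instrumented run that never triggers the error.

```python
import json
import os
import tempfile


def handle_request(value, catalog_path):
    """Return a result string; the catalog is opened only on the error path."""
    if value >= 0:
        return "ok"  # the happy path never touches catalog_path
    # Error path: the only place the (hypothetical) message catalog is read.
    with open(catalog_path, encoding="utf-8") as fh:
        messages = json.load(fh)
    return "logged: " + messages["negative_value"]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        catalog = os.path.join(tmp, "errors.en.json")
        with open(catalog, "w", encoding="utf-8") as fh:
            json.dump({"negative_value": "value must not be negative"}, fh)

        # Original image: both paths work, the error is merely logged.
        results = [handle_request(1, catalog), handle_request(-1, catalog)]

        # Simulate a slimming step whose test run only saw the happy path.
        os.remove(catalog)
        results.append(handle_request(1, catalog))
        try:
            handle_request(-1, catalog)
        except FileNotFoundError:
            results.append("crashed: FileNotFoundError")
        return results


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The happy path behaves identically before and after the file is removed, so a test harness that never sends a bad input would not notice the difference.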

Removing files also makes it very easy for your CVE scanner to report false negatives. For example, running docker scan on bitnami/redis returns a long scary list, including:

   Low severity vulnerability found in coreutils/coreutils
    Description: Race Condition
    Info: https://snyk.io/vuln/SNYK-DEBIAN11-COREUTILS-527269
    Introduced through: coreutils/coreutils@8.32-4+b1
    From: coreutils/coreutils@8.32-4+b1

The docker scan output on rapidfort/redis is empty, so great, we have no vulnerabilities, right?

   Tested rapidfort/redis for known vulnerabilities, no vulnerable paths found.

This particular CVE is present in chown and chgrp, according to Snyk's info link. The same version of chown that Snyk flags as vulnerable in bitnami/redis is also present in the rapidfort image:

   docker run --entrypoint=/bin/chown rapidfort/redis --version
  chown (GNU coreutils) 8.32

   docker run --entrypoint=/bin/chown bitnami/redis --version
  chown (GNU coreutils) 8.32

In this particular case, it looks like the original "vulnerability" is a false positive, but that doesn't change the wider point: when you clean up an image by removing files, it's really easy to remove whatever signatures a CVE scanner is looking for without actually removing the vulnerable code. Here, it looks like you removed /var/lib/dpkg/info/coreutils*, so Snyk doesn't think coreutils is installed, but some of the binaries are still present.
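A rough way to catch this class of problem is to look for binaries that are still on disk after their package metadata has been removed. The sketch below is an illustration only, in Python: it assumes the image's root filesystem has been exported to a local directory, and the `expected` mapping of package name to file paths is something you would have to supply yourself (for example, captured from the original image before slimming). The function names are made up for this example.

```python
import os
import tempfile


def orphaned_binaries(root, expected):
    """List (package, path) pairs whose file exists under root while
    var/lib/dpkg/info holds no entry starting with the package name.

    expected maps a package name to absolute paths inside the image.
    Prefix matching is deliberately loose; it is a sketch, not a dpkg parser.
    """
    info_dir = os.path.join(root, "var", "lib", "dpkg", "info")
    names = os.listdir(info_dir) if os.path.isdir(info_dir) else []
    orphans = []
    for package, paths in expected.items():
        has_metadata = any(name.startswith(package) for name in names)
        for path in paths:
            on_disk = os.path.join(root, path.lstrip("/"))
            if os.path.exists(on_disk) and not has_metadata:
                orphans.append((package, path))
    return orphans


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Fake slimmed rootfs: the binary survives, the metadata does not.
        os.makedirs(os.path.join(root, "bin"))
        open(os.path.join(root, "bin", "chown"), "w").close()
        expected = {"coreutils": ["/bin/chown", "/bin/chgrp"]}
        before = orphaned_binaries(root, expected)

        # Restore the metadata: a scanner could now see the package again.
        info_dir = os.path.join(root, "var", "lib", "dpkg", "info")
        os.makedirs(info_dir)
        open(os.path.join(info_dir, "coreutils.list"), "w").close()
        after = orphaned_binaries(root, expected)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Any non-empty result means a scanner that relies on the package database will be blind to code that is still present in the image.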

In my mind, false negatives are far scarier than false positives.

Finally, publishing an image modified like this without further cleanup makes you a poor community participant.

For example, Redis is distributed under the 3-clause BSD license, which requires any binary distribution to reproduce the copyright notice, license conditions, and disclaimer. Your image removes all of the license files, and your Docker Hub page simply says "free to use and has no license limitations". You're quite likely violating Redis' license, and that of other software still present in the image.

You've also left Bitnami's welcome banner in place:

   docker run rapidfort/redis
  redis 12:06:41.40
  redis 12:06:41.43 Welcome to the Bitnami redis container
  redis 12:06:41.44 Subscribe to project updates by watching https://github.com/bitnami/containers
  redis 12:06:41.46 Submit issues and feature requests at https://github.com/bitnami/containers/issues

If a user does encounter an issue with your modified image, you're directing the support burden to Bitnami, who will then have to spend time in triage determining that a modified image was in use and that their code may not actually be at fault.

I think trying to reduce attack surface by removing unnecessary parts of an image is a noble goal, but I don't think a mostly-automated approach is a safe way to do so. I would much prefer to see the output of the instrumented run being used by a human to guide slimming down a Dockerfile manually, which would produce safer images without the risks of automated post-processing.
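As a sketch of what "guiding a human" could look like: take the list of files accessed during the instrumented run, group it by owning package, and hand the reviewer a list of packages where nothing was touched. The Python below is illustrative only; the function names and the sample paths are hypothetical, and how you obtain the accessed-file list and the package-to-files mapping depends on your tooling.

```python
def summarize_usage(accessed_paths, package_files):
    """Map each package to the sorted list of its files seen during the run."""
    accessed = set(accessed_paths)
    return {
        package: sorted(accessed.intersection(files))
        for package, files in package_files.items()
    }


def removal_candidates(report):
    """Packages with no accessed files: candidates for a human to review,
    not a list to delete automatically."""
    return sorted(package for package, used in report.items() if not used)


if __name__ == "__main__":
    # Hypothetical data for illustration.
    accessed = ["/usr/bin/redis-server", "/lib/libc.so.6"]
    packages = {
        "redis": ["/usr/bin/redis-server", "/usr/bin/redis-cli"],
        "libc6": ["/lib/libc.so.6"],
        "coreutils": ["/bin/chown", "/bin/chgrp"],
    }
    report = summarize_usage(accessed, packages)
    print(removal_candidates(report))
```

The reviewer then drops whole packages in the Dockerfile, so the package database, license files, and scanner results stay consistent with what is actually in the image.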

6 comments

I have taken care of the welcome banner. Thanks for pointing this out.

  docker run rapidfort/redis
  redis 01:21:19.08
  redis 01:21:19.08 Welcome to the RapidFort optimized, hardened image for Bitnami redis container
  redis 01:21:19.08 Subscribe to project updates by watching https://github.com/rapidfort/community-images
  redis 01:21:19.09 Submit issues and feature requests at https://github.com/rapidfort/community-images/issues/new/choose

The missing licenses were a bug in my scripts. I have updated the container images to include all licenses. Thanks for pointing this out.

I have also included metadata in the container. This should allow any scanning tool to generate a scan report for the image.

I have also added links to the scan report on Docker Hub, so it's easy to see which CVEs still exist in the Docker image. There is no intention to mislead anyone in the community.

The RapidFort system, which I use for the open source community images, lets the user select files manually and integrate that selection into CI/CD. The community images project on GitHub uses GitHub Actions CI/CD to achieve the same manual curation.

You can direct the hardening process based on the instrumentation data, then bake it into your CI/CD and automate it. That's how the community images are produced.

The issue goes beyond careful package selection at the Dockerfile level. Even after carefully building your images, you'll pull in a good number of unused dependencies.

Thanks for pointing out the licensing issue. I will fix it. This is open source, and I am happy to accept any contributions.

Regarding support, our open source project has an issues page. I welcome the community to file issues there, and we will prioritize them.