by gilgad13 1592 days ago
I agree with many of the concerns the author raises, but I'm left with the question:

Given all this, what does layering give us?

It gives some deduplication, but only a crude form. It gives some reproducibility from building off a well-known base and tag, but not full reproducibility. It gives some security benefit from building off a well-known base, but not as large a benefit as standard package managers provide.

I would be excited to see an image distribution system based on something like casync, maybe with an initial rootfs formed through image-focused distributions like yocto[1]. The embedded device ecosystem has been concerned with reproducibility, image signing, and incremental updates for a while, and I think their approaches are very applicable to container images.

[1]: https://www.yoctoproject.org/

1 comments

Apparently, not a whole lot for image transfer and portability. But layering still gives you something at runtime if a single organization uses the same base image for all of its own containers. And, in practice, I think layer-level deduplication does still save on transfer costs. I'm not sure whether the author just wasn't considering where the industry was heading, but for projects that are rebuilt on every commit, the upper layers change far more often than distro base images do. Base images may be patched daily, so you need to re-download the whole thing every day, but if you're building 40 times a day, that's still better than downloading it 40 times a day. It's just a lot worse than what we could achieve if we downloaded only diffs instead of the entire layer whenever a single bit changes.
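To put a rough number on the "single bit changes" point, here is a minimal sketch that compares re-downloading a whole layer with fetching only the chunks that changed. It assumes fixed-size chunks addressed by SHA-256 and an uncompressed layer; the chunk size, function names, and synthetic layer are all made up for illustration, and this is not how casync or any registry actually chunks data (casync uses content-defined chunk boundaries).

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical chunk size, chosen only for illustration


def chunk_digests(blob: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Split a blob into fixed-size chunks and return each chunk's SHA-256."""
    return [
        hashlib.sha256(blob[i:i + size]).hexdigest()
        for i in range(0, len(blob), size)
    ]


def bytes_to_fetch(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> int:
    """Bytes a client that already holds `old` must download to get `new`,
    assuming individual chunks can be fetched by digest."""
    have = set(chunk_digests(old, size))
    total = 0
    for i in range(0, len(new), size):
        chunk = new[i:i + size]
        if hashlib.sha256(chunk).hexdigest() not in have:
            total += len(chunk)
    return total


# Synthetic stand-in for a 1 MiB uncompressed layer; every chunk is distinct.
layer_v1 = b"".join(i.to_bytes(4, "big") * (CHUNK_SIZE // 4) for i in range(256))

# Flip a single bit somewhere in the layer.
mutable = bytearray(layer_v1)
mutable[12345] ^= 0x01
layer_v2 = bytes(mutable)

# Layer-level addressing: the layer digest changed, so the whole blob is fetched.
whole_layer_cost = len(layer_v2)
# Chunk-level addressing: only the one chunk containing the flipped bit is fetched.
chunk_cost = bytes_to_fetch(layer_v1, layer_v2)

print(whole_layer_cost, chunk_cost)
```

In this toy case the whole-layer download is 1 MiB while the chunk-level download is a single 4 KiB chunk. Real layers are usually compressed tar archives, where a small change can alter much of the compressed stream, so the gap in practice depends heavily on where the chunking happens.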

It would be nice to see what, if anything, ever came of the ending tease. Something like git but also for binary files is what is called for. Arguably, ClearCase offered this exact feature 27 years ago, but being proprietary and expensive limited its adoption among modern web tooling.

You can have a "layer" build system using a snapshot-style approach. The fact that Dockerfiles and build scripts are written in a layered manner doesn't mean that our storage format needs to use layered tar archives that duplicate data needlessly.
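As a rough illustration of that distinction, the sketch below runs the same three build steps and counts stored bytes two ways: a simplified layer model where every new or changed path is stored in full, and a snapshot model where each step records a path-to-digest manifest over a content-addressed store. Everything here (the `Tree` type, the helper names, the file paths) is hypothetical and is not taken from any OCI proposal; tar headers, whiteouts, and manifest sizes are ignored.

```python
import hashlib

Tree = dict[str, bytes]  # path -> file contents


def layer_bytes(before: Tree, after: Tree) -> int:
    """Simplified layer-tar model: any path that is new or whose contents
    changed is stored in full in the new layer."""
    return sum(len(data) for path, data in after.items()
               if before.get(path) != data)


def snapshot(store: dict[str, bytes], tree: Tree) -> dict[str, str]:
    """Record the full tree as a path -> digest manifest, keeping each
    unique file body only once in the content-addressed store."""
    manifest = {}
    for path, data in tree.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)
        manifest[path] = digest
    return manifest


# Three build steps: add a large file, add a config file, then rename the large file.
step1: Tree = {"/opt/data.bin": b"\xab" * 1_000_000}
step2: Tree = {**step1, "/etc/app.conf": b"debug=false\n"}
step3: Tree = {
    "/srv/data.bin": step2["/opt/data.bin"],  # same bytes, new path
    "/etc/app.conf": step2["/etc/app.conf"],
}

states = [{}, step1, step2, step3]

# Layered storage: the rename in step 3 stores the large file a second time.
layered_total = sum(layer_bytes(a, b) for a, b in zip(states, states[1:]))

# Snapshot storage: the rename only changes a manifest entry.
store: dict[str, bytes] = {}
manifests = [snapshot(store, tree) for tree in states[1:]]
snapshot_total = sum(len(body) for body in store.values())

print(layered_total, snapshot_total)
```

The build is still expressed as ordered steps in both cases; only the storage model differs. In this toy run the layered model stores the large file twice because of the rename, while the snapshot model stores it once.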

As for the tease, sorry about that -- there were several discussions in the OCI community in relation to my proposals and other issues we might want to fix but sadly work has stalled.