Is there any performance benefit to having fewer layers? My understanding is that there's no gain by merging layers as the size of the image remains constant.
There are some useful cases — for example, if you're taking a rather bloated image as a base and trimming it down with `rm` commands, those will be saved as differential layers, which will not reduce the size of the final image in the slightest. Only merging will actually "register" these deletions.
It's less about performance and more about security. Lots of amateur images use a secret file, or inadvertently store a secret in a layer, without realizing that an `rm` or other process in a later layer doesn't actually eliminate it. If the final step of your build squashes the filesystem flat again, you can remove a lot of potentially exposed metadata and secrets stored in intermediate layers.
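To make the two comments above concrete, here is a minimal sketch in Python. It is a toy model, not how any real image format stores data: layers are plain dicts, `WHITEOUT` is a made-up deletion marker, and the paths and the secret are hypothetical.

```python
# Toy model of image layers: each layer maps path -> file bytes.
# WHITEOUT is a made-up marker meaning "this layer deleted the path",
# standing in for what an `rm` in a later build step records.
WHITEOUT = None

layers = [
    {"/app/bin": b"x" * 1000, "/root/.secret": b"hunter2"},  # base layer
    {"/root/.secret": WHITEOUT},                              # later layer runs `rm`
]

def flatten(layers):
    """Apply layers bottom-up and return the filesystem the container sees."""
    fs = {}
    for layer in layers:
        for path, data in layer.items():
            if data is WHITEOUT:
                fs.pop(path, None)   # hide the file from the merged view
            else:
                fs[path] = data
    return fs

def stored_bytes(layers):
    """File bytes kept across all layers; a whiteout frees nothing below it."""
    return sum(len(d) for layer in layers for d in layer.values() if d is not WHITEOUT)

merged = flatten(layers)
squashed = [merged]  # a single layer holding only the final state
```

In this model the secret is gone from `merged`, yet `layers[0]` still holds it and `stored_bytes(layers)` still counts it. Only `squashed` drops both the bytes and the secret.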
Eventually, once zstd is fully supported and tiny gzip compression windows are no longer a limitation, compressing one full layer would almost certainly give a better ratio than compressing several smaller layers.
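A rough way to see the window effect with only the Python standard library. This sketch assumes `lzma` as a stand-in for a large-window compressor such as zstd (it is not zstd), and uses a contrived worst case: the same 100 KB of incompressible data present in two layers.

```python
import lzma
import random
import zlib

rng = random.Random(0)
# 100 KB of incompressible bytes that appears in two different layers
# (think of the same library copied in by two build steps).
blob = bytes(rng.getrandbits(8) for _ in range(100_000))
layer_a, layer_b = blob, blob

def split_vs_joint(compress):
    """Return (size when layers are compressed separately, size as one stream)."""
    split = len(compress(layer_a)) + len(compress(layer_b))
    joint = len(compress(layer_a + layer_b))
    return split, joint

# DEFLATE (gzip/zlib) only looks back 32 KiB, so it cannot see the repeat.
zlib_split, zlib_joint = split_vs_joint(zlib.compress)
# lzma's default window is several MiB, so the second copy is nearly free.
lzma_split, lzma_joint = split_vs_joint(lzma.compress)
```

With these inputs, merging barely helps the small-window compressor, while the large-window one roughly halves the output. Real layers are far less extreme, so treat this as the direction of the effect rather than its size.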
If you've got a 50-layer image, then each time you open a file, I believe the kernel has to look for that file in all 50 layers before it can fail with ENOENT.
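A toy model of the lookup cost described above, assuming a naive top-down search. This is not the kernel's overlay implementation: it ignores whiteouts, opaque directories, and any caching of earlier lookups, and the layer contents are hypothetical.

```python
def lookup(layers, path):
    """Search layers from top to bottom; return (found, number of layers probed)."""
    probes = 0
    for layer in reversed(layers):  # the topmost layer is checked first
        probes += 1
        if path in layer:
            return True, probes
    return False, probes  # the ENOENT case: every layer was consulted

# 50 hypothetical layers, each a set of paths; only the bottom one has /bin/sh.
layers = [{"/bin/sh"}] + [{f"/opt/step{i}"} for i in range(1, 50)]

hit = lookup(layers, "/opt/step49")     # present in the top layer
deep = lookup(layers, "/bin/sh")        # present only in the bottom layer
miss = lookup(layers, "/no/such/file")  # present nowhere
```

Under this model a miss costs as many probes as there are layers, and so does a hit on a file that only exists in the base layer.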
It depends on your OCI engine, but this isn’t the case with containers. Each layer is successively “unpacked” on top of a “snapshot”, and containers are created from that snapshot.
A container runtime could optimize for speed by unpacking all those layers one by one into a single lower directory for the container to use, but at the cost of a lot of disk space, since those layers would no longer be shared between different containers.
In practice I've found the performance savings often go the other way: for large (multi-GB) images, it's faster to split the image into more layers, which the client can download from the registry in parallel. A single layer's download won't be parallelized, and in EC2+ECR you won't get particularly good throughput with a single layer.
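Back-of-the-envelope arithmetic for the point above, as a small Python sketch. The assumptions are mine: each layer is one stream, each stream is capped at a fixed rate, the link itself is never the bottleneck, and the sizes, rate, and stream count are made up.

```python
def pull_seconds(layer_mb, per_stream_mb_s, max_parallel):
    """Estimate pull time when each layer is a single stream capped at per_stream_mb_s."""
    # Greedy schedule: hand the next-largest layer to the least-loaded stream.
    streams = [0.0] * max_parallel
    for size in sorted(layer_mb, reverse=True):
        i = streams.index(min(streams))
        streams[i] += size / per_stream_mb_s
    return max(streams)  # the pull finishes when the busiest stream finishes

# One 4 GB layer versus four 1 GB layers, 50 MB/s per stream, 4 streams.
single = pull_seconds([4000], per_stream_mb_s=50, max_parallel=4)
split = pull_seconds([1000] * 4, per_stream_mb_s=50, max_parallel=4)
```

Here `single` comes out to 80 seconds and `split` to 20, because the single large layer leaves three of the four streams idle.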
It depends. If updates force you to re-fetch a big layer often, that's not good. But if the frequently changing content sits in a smaller layer, splitting works in your favor.
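The same trade-off as a tiny calculation, assuming a client only fetches layers whose content changed since its last pull. Layer names and sizes (in MB) are hypothetical.

```python
def repull_mb(layers, changed):
    """MB a client fetches on update: only the layers whose content changed."""
    return sum(size for name, size in layers.items() if name in changed)

# Only the application code changes between releases.
monolithic = {"everything": 2050}
split = {"os+deps": 2000, "app": 50}

mono_cost = repull_mb(monolithic, changed={"everything"})
split_cost = repull_mb(split, changed={"app"})
```

With these numbers every release costs 2050 MB for the single-layer image and 50 MB for the split one. Put the frequently changing files in the big layer instead and the advantage disappears.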