by alberteinstein 3059 days ago
When a container is scheduled to a node in Kubernetes and the image is already present, it takes only ~2s for it to be up and running. But if the image is not present on the node and is 2GB, download speed becomes the rate-limiting factor, pushing startup times well beyond 2 seconds. And in a multi-node environment where you scale up and down, or autoscale, new nodes can come up often. So, Leaner images are always better.
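
For a rough sense of scale, here is a back-of-the-envelope sketch. The 100 MB/s effective bandwidth is an assumption picked for illustration, not a measurement, and the helper name is made up.

```python
def pull_seconds(image_bytes, bandwidth_bytes_per_s):
    """Lower bound on pull time; ignores registry latency and layer extraction."""
    return image_bytes / bandwidth_bytes_per_s

GB = 10**9
MB = 10**6

# 100 MB/s is an assumed effective bandwidth, not a measured figure.
print(pull_seconds(2 * GB, 100 * MB))  # 20.0
```

Under that assumption the pull alone is about ten times the ~2s warm start.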
1 comments

> So, Leaner images are always better.

That's a very vague statement, though, given the layer concept. The same final filesystem image can use any number of layers to get there. The way you organize the layers in your images, and share data between multiple apps, is important on top of generally keeping things reasonably small.

What I think you're trying to say is that fast startup is preferable. What you have not addressed is average startup time, which is where you still benefit dramatically by understanding layers.

What matters is the smallest amortized download per node, not the smallest individual image. If you can get all the services for a new node started with 2GB of total download, that's better than 3GB of total download.

Using layers appropriately, and perhaps adding dependencies used by 80% of your services to your base image, may be best for overall efficiency. Know the tool and know your usage patterns.
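
To make the amortized point concrete, a minimal sketch: a fresh node pulls each distinct layer only once, so the cost is the sum over unique layers, not over images. All layer names and sizes below are hypothetical.

```python
def node_download_bytes(images):
    """Bytes pulled onto an empty node: each distinct layer is fetched once."""
    unique = {}
    for layers in images.values():
        for layer_id, size in layers.items():
            unique[layer_id] = size
    return sum(unique.values())

MB = 10**6

# Hypothetical: two 1500 MB images that share a 1000 MB base layer.
shared_base = {
    "svc-a": {"base": 1000 * MB, "a-app": 500 * MB},
    "svc-b": {"base": 1000 * MB, "b-app": 500 * MB},
}
# Hypothetical: two "leaner" 1200 MB images with nothing in common.
separate_slim = {
    "svc-a": {"a-slim": 1200 * MB},
    "svc-b": {"b-slim": 1200 * MB},
}

print(node_download_bytes(shared_base) // MB)    # 2000
print(node_download_bytes(separate_slim) // MB)  # 2400
```

In this made-up case the individually larger images cost less per new node, because the shared base is only downloaded once.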

This adds an unnecessary dependency between services. The team working on service A would need to coordinate with the team working on service B to keep the same layers. The contract should be to keep each image as small as possible, so that teams can work independently.

Also, python:3.6 doesn't always mean the same layers, since the tag can be rebuilt with entirely different layers, unless FROM is pinned to a particular image digest.
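
A FROM line can reference an image by digest (`image@sha256:...`) instead of a mutable tag. As a rough sketch, a check like the following could flag FROM lines that are not pinned that way. The function name is made up, the digest is a placeholder rather than a real one, and the check is naive: it would also flag a FROM that refers to an earlier build stage by name.

```python
def unpinned_from_lines(dockerfile_text):
    """Return image references in FROM lines that lack an @sha256: digest."""
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Skip flags such as --platform=... to find the image reference.
        refs = [p for p in parts[1:] if not p.startswith("--")]
        if refs and refs[0] != "scratch" and "@sha256:" not in refs[0]:
            unpinned.append(refs[0])
    return unpinned

# Placeholder digest for illustration only; not a real python:3.6 digest.
fake_digest = "0" * 64

print(unpinned_from_lines("FROM python:3.6\nRUN pip install flask\n"))
print(unpinned_from_lines(f"FROM python:3.6@sha256:{fake_digest} AS build\n"))
```

The first call reports `python:3.6` as unpinned; the second returns an empty list.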