Hacker News
by shalabhc 1204 days ago
(author here)

There are two sides to the problem: the build and the deploy.

> it feels like just having a build machine with storage would immediately solve the problem, there would be no remote pulling. Is that not a solution?

Indeed, ephemeral builders are an issue and this would be an alternative solution. However, doing this for an arbitrary number of users who want on-demand deploys for a variety of projects is non-trivial. Most CI environments give you a "clean environment" at startup.

Even if the build takes 0 seconds, the other part is re-running this docker image (where 100s of MB are unchanged and, say, only the final 1MB layer changed). You need to ship all these layers to a provisioned container and boot it up. Even this can be optimized with heavy caching and by tuning the container service, but that's an alternative solution with different trade-offs. With pex, we can update any docker image in-place, so it is container service agnostic. It also works with our current service (Fargate), which has famously slow startup times.
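To put rough numbers on the layer-shipping point, here is a minimal sketch. The layer digests and sizes are made up for illustration, and `mb_to_pull` is a hypothetical helper, not part of Docker or pex. It only models the idea that a host with no layer cache pulls everything, while a host that kept the unchanged layers pulls just the final one.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Layer:
    digest: str     # content hash identifying the layer (made-up values below)
    size_mb: float  # compressed size of the layer in MB (made-up values below)


def mb_to_pull(image: Iterable[Layer], cached_digests: Set[str]) -> float:
    """Return the total size (MB) of layers that are not already on the host."""
    return sum(l.size_mb for l in image if l.digest not in cached_digests)


# Hypothetical image: large unchanged base and dependency layers, tiny app layer.
image = [
    Layer("sha256:base", 300.0),
    Layer("sha256:deps", 200.0),
    Layer("sha256:app-v2", 1.0),
]

# Freshly provisioned container host: nothing cached, every layer is shipped.
cold = mb_to_pull(image, set())

# Host that kept the previous image: only the changed final layer is shipped.
warm = mb_to_pull(image, {"sha256:base", "sha256:deps"})

print(f"cold host pulls {cold} MB, warm host pulls {warm} MB")
```

The gap between the two numbers is the cost paid on every deploy when the container service starts from an empty cache.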

2 comments

> Most CI environments give you a "clean environment" at startup.

Yeah, it was partly that which made me wonder if just removing that would help a lot. Seems weird to keep a clean fresh CI setup and not have that in production

I understand handing off the scaling issues to someone else. Perhaps this is the difference between the "best outcome" and "best solution" as the latter takes into account what already exists.

K8s does some optimisation on image layers but my knowledge there is fuzzy.

Perhaps this is the best solution given the constraints, but it's a shame that the fully supported layering system in docker needs to be recreated essentially due to two services throwing away lots of useful data. A container runner that stops your container, pulls the layers and restarts it would only have the container startup time itself - which is extremely small - and keep everything separated.
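A rough sketch of what such a runner could look like, assuming a host with the standard `docker` CLI and a persistent local layer cache. The `redeploy` function and its `run` parameter are hypothetical names for illustration. One deliberate deviation from the order described above: the pull happens before the stop, so the container is only down for the stop/start itself.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(list(cmd), check=True)


def redeploy(name: str, image: str, run: Runner = _run) -> List[List[str]]:
    """Pull the new image, then replace the running container called `name`.

    Assumes a container called `name` already exists on this host.
    """
    steps = [
        # Layers already present in the local cache are not downloaded again.
        ["docker", "pull", image],
        ["docker", "stop", name],
        ["docker", "rm", name],
        ["docker", "run", "-d", "--name", name, image],
    ]
    for cmd in steps:
        run(cmd)
    return steps


# Dry run with a recording runner instead of the real docker CLI.
issued: List[List[str]] = []
redeploy("web", "example/app:v2", run=lambda cmd: issued.append(list(cmd)))
for cmd in issued:
    print(" ".join(cmd))
```

Passing a different `run` callable keeps the sketch testable without Docker installed; dropping the argument would execute the commands for real.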

I agree with your sentiment.

K8s may have some more controls for the "incremental deployment" case, but I'm less confident about the isolation between pods when running user-provided code.

>(Fargate) which has famously slow startup times

Haven't tested it in a couple of years, but start time used to correlate with image size (which makes sense). I think the Fargate VMs don't have very many IOPS, so you're usually waiting on extracting tarballs for your container to start.

With that in mind, it can sometimes be more efficient to ignore Docker layer best practices and just try to smash everything down into the smallest single layer possible.