There are tools for doing this, but it's a matter of cost and complexity to deal with them.
Artifactory seems to have a pretty big chunk of this vertical. It supports a few different repository protocols, so it serves as a bit of a one-stop shop that survives technology changes.
Way more than “kinda”. If you have a continuous integration pipeline that checks out projects from scratch (as it should), every build fetches all dependencies, transitively.
Even ignoring download costs, a local cache (one of the functions of an artifactory) helps speed up those downloads and, with them, your builds. It probably also helps you avoid getting blacklisted by code repositories.
An artifactory also automatically backs up any libraries you use. That protects against them disappearing from the internet.
I think the first wave of artifactory customers was also populated by companies with limited network connectivity. It’s nuts to run a Rails or J2EE project if your company is using a pair of 1MB modems for all traffic, even if the dependencies are relatively small. Branch offices are similarly hamstrung. That was part of Perforce’s customer base as well, since they could run a local proxy for source code.
As you get into CI/CD you start to notice that your upstream repo is occasionally down, because it’s getting in the way of some deadline.
We use Nexus and cache all of our packages, but it is one more system to maintain and update. Sure, Nexus is a great asset and has almost never given us trouble.
AWS has a managed service, CodeArtifact, that supports all the common code package repos and allows caching of upstream repos. Granted, it doesn't work with Docker images, but you asked about packages.
Someone would have to support it 24x7 and we could never get the uptime of DockerHub/ACS/ECS. Since a production k8s deployment could spin up an instance at any time of day, some type of five-nines or at least four-nines uptime is pretty important.
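To put numbers on those uptime targets, here is a rough sketch of the yearly downtime budget each one implies. The function name is made up for illustration, and the arithmetic assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines.

    4 nines means 99.99% availability, i.e. 1/10**4 of the year may be down.
    """
    return MINUTES_PER_YEAR / 10 ** nines


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
```

Four nines leaves a little under an hour of downtime per year and five nines only a few minutes, which is a hard target for a self-hosted registry that one team supports.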
1) set up a series of N docker registry mirrors in pull-through mode (https://docs.docker.com/registry/recipes/mirror/, it's as simple as "docker run --rm --name registry -d -p 5000:5000 -e REGISTRY_STORAGE_DELETE_ENABLED=true -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io -v /mnt/persistentdata/registry:/var/lib/registry registry")
2) expose them on the same domain name (multiple A records, loadbalancers, whatever you want)
3) set them as mirrors in each machine's docker daemon
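For step 3, the Docker daemon reads its mirror list from the "registry-mirrors" key of its config file (typically /etc/docker/daemon.json on Linux). A minimal sketch of adding a mirror without clobbering other settings follows; the helper name and the mirror hostname are hypothetical.

```python
import json


def with_registry_mirror(daemon_config: dict, mirror_url: str) -> dict:
    """Return a copy of a Docker daemon config with mirror_url listed
    under "registry-mirrors", leaving every other key untouched."""
    updated = dict(daemon_config)
    mirrors = list(updated.get("registry-mirrors", []))
    if mirror_url not in mirrors:  # avoid duplicate entries on re-runs
        mirrors.append(mirror_url)
    updated["registry-mirrors"] = mirrors
    return updated


# Hypothetical hostname: the shared domain name from step 2.
config = with_registry_mirror({}, "https://docker-mirror.example.internal")
print(json.dumps(config, indent=2))
```

The printed JSON is what would go into the daemon config file on each machine; the daemon has to be restarted to pick it up.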
If one of your mirrors goes down, take it out of the DNS/LB rotation. That's it.
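Deciding which mirrors stay in rotation can be scripted. The sketch below probes each mirror's /v2/ endpoint, which is the API root of a Docker registry; the function names and the choice to count a 401 as "up" are assumptions, and actually updating DNS or the load balancer is left out.

```python
import urllib.error
import urllib.request


def registry_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """True if the registry answers on its /v2/ API root."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/", timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        # Assumption: a 401 means the registry is alive but wants credentials.
        return err.code == 401
    except OSError:
        # Covers URLError, connection refused, DNS failures, and timeouts.
        return False


def mirrors_to_keep(mirrors, probe=registry_is_up):
    """Return the mirrors that pass the probe, in their original order."""
    return [m for m in mirrors if probe(m)]
```

Passing the probe in as a parameter makes it easy to swap in a stricter check, such as pulling a known small image through the mirror.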
No, they do not; that is the entire beauty of the pull-through mirror. For user code, as long as it keeps referencing only Docker Hub images, nothing needs to be changed (edit: except GitLab CI configurations using docker:dind, which need to be informed about the mirror).
The only downside is, as said, that it can't cache third-party repos (quay.io comes to mind for people involved in k8s). For these, one has to mess with resolv.conf and self-signed HTTPS certs for the Docker registry mirror.
Nothing. We did this when NPM was having issues and it worked very well for us. We also did this for some non-US teammates who had very poor NPM performance.
It runs well, is easy to keep up and running, and was generally awesome.