by peterthehacker 1706 days ago
> Hub disappearing would be a 100 times worse than the left pad incident in the npm world

This is really overdramatic. If Docker Inc. went out of business and Docker Hub was shut down, the void would be filled very quickly. Many cloud providers would step in with new registries. Also, swapping in a new registry for your base images is really easy. Not to mention the tons of lead time you’d get before Docker Hub goes down to swap them. Maybe they’d even fix https://github.com/moby/moby/issues/33069 on their way out, so we can just swap out the default registry in the config and be done with it.
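For what it's worth, the "swap the registry" step can be sketched in a few lines of Python. This is only an illustration: `split_registry`, `swap_registry` and the host `registry.example.com` are made-up names, and the short-name rule used here (the first path component counts as a registry host only if it contains a dot or a colon, or is `localhost`) is how Docker expands a name like `alpine` to `docker.io/library/alpine`.

```python
from typing import Tuple


def split_registry(ref: str) -> Tuple[str, str]:
    """Split an image reference into (registry, remainder) using Docker's short-name rule."""
    first, sep, rest = ref.partition("/")
    # The first component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No slash at all: an official image, which lives under "library/" on Docker Hub.
    if not sep:
        return "docker.io", "library/" + ref
    # user/image style reference, implicitly on Docker Hub.
    return "docker.io", ref


def swap_registry(ref: str, new_registry: str) -> str:
    """Point a Docker Hub reference at new_registry; leave other registries untouched."""
    registry, remainder = split_registry(ref)
    if registry != "docker.io":
        return ref
    return f"{new_registry}/{remainder}"


# Hypothetical replacement registry, for illustration only.
print(swap_registry("alpine:3.18", "registry.example.com"))
```

Note that this only rewrites names. It assumes somebody has actually uploaded the same image to the new registry.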

2 comments

> Also, swapping in a new registry for your base images is really easy.

This is the exact problem! Sure, MySQL, PHP, JDK, Alpine and other images would probably be made available, but what about the other images that you might rely on, whose developers might simply no longer care about them or might not have the free time to reupload them to a new place?

Sure, you should be able to build your own from the source and maintain them, but in practice there are plenty of cases when non-public-facing tools don't need updates and are good for the one thing that you use them for. Not everyone has the time or resources to familiarize themselves with the inner workings of everything that's in their stack, especially when they have social circumstances to deal with, like business goals to be met.

In part, that's why I suggest that everyone get a copy of JFrog Artifactory or a similar solution and use it as a caching proxy in front of Docker Hub or any other registry. That's what you should be doing in the first place anyway, both to avoid the Docker Hub rate limits and to speed up your builds, instead of downloading everything from the internet every time.
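To show why a caching proxy helps here, a minimal in-memory sketch of the pull-through idea in Python. It is not how Artifactory or any real registry is implemented; `PullThroughCache` and the `upstream` dictionary are hypothetical stand-ins, and a real proxy would persist blobs to disk and handle expiry.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Serve blobs from a local store, asking the upstream registry only on a miss."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.upstream_requests = 0  # how many times we had to go to the internet

    def get(self, ref: str) -> bytes:
        if ref not in self._store:
            self.upstream_requests += 1
            self._store[ref] = self._fetch_upstream(ref)
        return self._store[ref]


# Hypothetical stand-in for an upstream registry that later goes away.
upstream = {"alpine:3.18": b"layer-bytes"}
cache = PullThroughCache(lambda ref: upstream[ref])

first = cache.get("alpine:3.18")   # miss: fetched from upstream once
upstream.clear()                   # simulate the upstream disappearing
second = cache.get("alpine:3.18")  # hit: still served from the local store
```

Anything that was pulled at least once stays available, and repeated builds cost a single upstream request. Images that were never pulled through the proxy are still lost if the upstream vanishes.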

Otherwise it's like saying that if your Google cloud storage account gets banned, you can just use Microsoft's offering, while it's the actual data that was lost that's the problem - everything from your Master's thesis, to pictures of you and your parents. Perhaps that's a pretty good analogy, because the reality is that most people don't or simply can't follow the 3-2-1 rule of backups either.

The recent Facebook outage cost millions in losses. Imagine something like that for CI/CD pipelines: a huge number of companies would be unable to deliver value, work everywhere would grind to a halt, and shareholders wouldn't be pleased.

Of course, whether we as a society should care about that is another matter entirely.

Using an abandoned image that nobody cares to update carries its own set of problems (e.g. security).
As I said, if it's not exposed to the outside world and doesn't work with untrusted data, that claim is not entirely valid.

Imagine something like this getting abandoned, or someone running a year old version of it: https://github.com/crazy-max/swarm-cronjob/blob/master/READM...

Its only job is to run containers on a particular schedule, no more, no less. There are very few attack vectors for something like that, considering that it neither talks to the outside world nor processes any user input.

Then again, it's not my job to pass judgement on situations like that, merely acknowledge that they exist and therefore the consequences of those suddenly breaking cannot be ignored.

If you depend on it, you should keep a local copy around that you can host if needed.

Things get abandoned all the time. When you make them part of your stack, you are forever indebted to keeping them alive yourself until the point at which you free yourself from that burden.
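Keeping a local copy can be as simple as `docker pull` followed by `docker save`. A rough Python sketch, assuming the `docker` CLI is on the PATH; the helper names, the `backup` directory and the image name `example/some-tool:1.0` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def archive_commands(image: str, out_dir: str) -> List[List[str]]:
    """Build the docker commands that pull an image and save it as a tarball."""
    # Image names contain "/" and ":", which are awkward in file names.
    safe_name = image.replace("/", "_").replace(":", "_")
    tarball = str(Path(out_dir) / f"{safe_name}.tar")
    return [
        ["docker", "pull", image],
        ["docker", "save", "-o", tarball, image],
    ]


def archive_image(image: str, out_dir: str) -> None:
    """Run the commands; raises CalledProcessError if docker reports a failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in archive_commands(image, out_dir):
        subprocess.run(cmd, check=True)


# Hypothetical image name, shown without running docker.
for cmd in archive_commands("example/some-tool:1.0", "backup"):
    print(" ".join(cmd))
```

The tarball can later be restored with `docker load -i <file>` and pushed to whatever registry you end up hosting.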

If only we could have a truly distributed system for storing content-addressed blobs ... perhaps using IPFS for Docker images. This way you could swap the hosting provider without having to update the image references.
I’d love for others who are more knowledgeable to chime in, since this feels close to the logical end state for non-user-facing distribution. At a protocol level, content basically becomes a combination of a hash/digest and one or more canonical sources/hubs. This allows any intermediary to cache or serve the content to reduce bandwidth and increase locality, and there could be many different implementations for different environments to take advantage of local networks as well as public networks, in a similar fashion to recursive DNS resolvers. That way you could transparently cache at the host level as well as at, e.g., your local cloud provider to reduce latency and bandwidth.
Sounds a lot like BitTorrent.
I’m not super well versed, but I thought BitTorrent’s main contribution was essentially the chunking and distributed hash table. There is perhaps a good analog in the different layers of a Docker image.
Isn't this what magnet links for torrent files have provided for years? Maybe even a decade? https://en.wikipedia.org/wiki/Magnet_URI_scheme
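The "digest plus one or more sources" idea from this subthread can be sketched in Python as follows. This is a toy illustration, not IPFS, BitTorrent, or the registry protocol; `digest_of`, `fetch_verified` and the three mirror functions are hypothetical. The only borrowed convention is the `sha256:<hex>` digest format that Docker image digests use.

```python
import hashlib
from typing import Callable, Iterable, Optional


def digest_of(blob: bytes) -> str:
    """Content address of a blob, in the sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def fetch_verified(
    digest: str, sources: Iterable[Callable[[str], Optional[bytes]]]
) -> bytes:
    """Try each source in turn and accept the first blob whose hash matches."""
    for fetch in sources:
        try:
            blob = fetch(digest)
        except Exception:
            continue  # this source is unavailable, try the next one
        if blob is not None and digest_of(blob) == digest:
            return blob  # the content proves itself, whoever served it
    raise LookupError(f"no source returned valid content for {digest}")


content = b"example layer"
wanted = digest_of(content)


def dead_mirror(_digest: str) -> bytes:
    raise ConnectionError("mirror is gone")


def tampered_mirror(_digest: str) -> bytes:
    return b"something else"


def good_mirror(_digest: str) -> bytes:
    return content


result = fetch_verified(wanted, [dead_mirror, tampered_mirror, good_mirror])
```

Because the reference is the hash rather than the host, any cache or mirror can serve the bytes without being trusted, which is what lets the hosting provider change without touching the image reference.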