by wmf 4210 days ago
It seems like it ought to be easier to document Docker than to rewrite it from scratch, but maybe I'm wrong.

Likewise if Docker is already one big binary it shouldn't be impossible to create a daemonless mode (Doxorcism?).


Docker is not static. If much of it is implementation-defined, and the implementation is evolving, then documenting a particular version of Docker is only as valuable as my willingness to stay locked down to that particular version. And the APIs are not stable: just look at https://github.com/jenkinsci/docker-plugin/issues/115 to see how a Docker upgrade can break a project that uses its APIs (and if the plugin used the CLI rather than the HTTP API, it would still break: the cause was a change in formatting of information that was not available anywhere in a parseable form, and as far as I know is still available only as "pretty text" to be parsed by a regexp).
Yes, documentation would need to be an ongoing process combined with a policy that code changes need to also update the relevant documentation. This may make development slower but revolutions aren't exactly efficient either (or I've just been playing too much Unity).
What good is updated documentation if a change breaks my code? The good way would be to first define and stabilize interfaces, and work from there - but this would require stopping for a while to think, and refraining from adding new features in the meantime, which doesn't seem likely with Docker.

And while I don't support "revolution" (or even think that Rocket is necessarily better than Docker, or that Better One Should Win), I really appreciate a break from the monoculture. Having one large player in any area is harmful. Even if Rocket stays a niche project, its existence on the market will influence Docker's strategy.

Is there a particular feature you wish Docker had not added? 1.4 RC had 2 months worth of stabilization and basically no new features.
It's less about particular features (none of these is individually harmful, but I personally find the monolithic architecture a bit of an issue), and more about taking time to actually define stable and complete APIs and boundaries. Stabilization (from technical reliability POV) is all good and fine, but - for instance - is there a way to easily find out containers' names and links, and exact forwarded ports, from the API without regexping `docker ps` output and "tcp/1234"-like strings that should be meant for human eyes only? Is it still Docker's official position that the CLI is the only official interface, and using the HTTP API is frowned upon? Is all information about a container available in `docker inspect` or its API equivalent, or do I still need to parse `docker ps` to find out about names and links? Is the registry/index mess ever going to go away, and did it even make sense (other than vendor lock-in) at some point? Since Docker and registry API is HTTP, why can't I use http auth (even implemented by an nginx proxy) instead of certs (which are fairly new) or centralized index for authentication? It should be as simple as using `https://user:pass@IP/…` URLs, what's the problem with it? While we're at it, why on Earth is HTTP Docker endpoint specified as `tcp://IP:PORT` rather than `https://IP:PORT`, which would make it possible and easy to proxy, use http auth in an intuitive way, and generally reflect how it works rather than obscure it? Should I go on?
> (none of these is individually harmful, but I personally find the monolithic architecture a bit of an issue)

The architecture is not monolithic. I want to break down Docker into the most discrete and composable parts possible. But I want to do that without hurting the user experience. If you can point out a way to do that, I will implement it. And if you are available to help, it will happen faster.

> is there a way to easily find out containers' names and links, and exact forwarded ports, from the API without regexping `docker ps` output and "tcp/1234"-like strings that should be meant for human eyes only?

You are not expected to parse `docker ps` output programmatically. That is not and has never been the recommended way to interact with Docker programmatically.

Yes, the current API expects clients to parse strings like tcp/1234. A string of the form [PROTO/]PORT seemed like a pretty reasonable thing to parse. But if you would prefer a json object like {"proto":"tcp", "port":1234}, I don't have a fundamental problem with that. Feel free to open a github issue to suggest it. In general we frown upon randomly breaking things but if it improves the situation for clients we will consider it. It could also make for a nice first patch if you're interested in that :)
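To make the trade-off concrete, here is a minimal Python sketch of what a client has to do with the string form, and what the JSON form would hand over directly. The `parse_port_spec` helper is hypothetical, and defaulting to `tcp` when the protocol is omitted is an assumption; only the `[PROTO/]PORT` shape and the `{"proto": ..., "port": ...}` object come from the comment above.

```python
import json


def parse_port_spec(spec, default_proto="tcp"):
    """Parse a '[PROTO/]PORT' string such as 'tcp/1234' or '1234'.

    Hypothetical helper for illustration; the default protocol is an
    assumption, not documented Docker behaviour.
    """
    # rpartition returns ('', '', spec) when there is no '/' at all.
    proto, sep, port = spec.rpartition("/")
    if not sep:
        proto = default_proto
    if not proto or not port.isdigit():
        raise ValueError(f"not a [PROTO/]PORT string: {spec!r}")
    return {"proto": proto, "port": int(port)}


# The structured form suggested in the thread, serialized as JSON.
as_json = json.dumps(parse_port_spec("tcp/1234"))
```

With the JSON form, a client would call `json.loads` and read the two fields, with no string splitting or validation of its own.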

> Is it still Docker's official position that the CLI is the only official interface, and using the HTTP API is frowned upon?

That was never true. I'm not sure how you got that idea. The HTTP API has existed forever, the official client uses it for 100% of its functionality, and I definitely recommend using that as the primary mode of interaction with Docker. I acknowledge that the API itself is not perfect, and will welcome any suggestions for improving it (we are discussing quite a few improvements already on #docker-dev).

> Since Docker and registry API is HTTP, why can't I use http auth (even implemented by an nginx proxy) instead of certs (which are fairly new) or centralized index for authentication?

You can use all these things with the registry API. The centralized index is completely optional for authentication.

I'm curious where you got all these incorrect notions? If you read it somewhere in the docs, then it's a documentation bug and I would appreciate it if you could point it out so we can fix it.

> It should be as simple as using `https://user:pass@IP/…` URLs, what's the problem with it?

I don't understand that part. I believe Docker registry auth today uses basic auth over TLS by default, which has security issues (i.e. storing your password in the clear in your home directory). We are working to add a more secure token-based auth. But you should always be able to use basic auth if you prefer that.
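For readers unfamiliar with why basic auth depends on TLS, here is a small Python sketch of how a generic HTTP client builds the header. It illustrates HTTP basic auth in general, not Docker's registry client; the `basic_auth_header` helper and the `user`/`pass` credentials are made up for the example.

```python
import base64


def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth is just base64("user:password"). It is an encoding,
    # not encryption, so the credentials are only protected by TLS.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("user", "pass")

# Anyone who sees the header can trivially recover the password.
recovered = base64.b64decode(header["Authorization"].split(" ", 1)[1])
```

The same reversibility applies to any file that stores the encoded value, which is the "password in the clear" concern mentioned above.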

> While we're at it, why on Earth is HTTP Docker endpoint specified as `tcp://IP:PORT` rather than `https://IP:PORT`, which would make it possible and easy to proxy, use http auth in an intuitive way, and generally reflect how it works rather than obscure it?

I don't understand your question here either. Are you saying you would like the format of the command-line option `docker -H` to start with https:// instead of tcp://? If so, I think that's a good idea, but it also seems like a ridiculously small and cosmetic problem. I also don't understand how it affects http proxying or auth in any way. Regardless of the command-line flag, the API actually uses https. So auth and proxying should work completely as expected (with the exception of `attach`, which drops to the raw TCP socket to allow bi-directional streams. But we have a websockets endpoint for that and I would really like to deprecate the "drop to tcp" mode soon).
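As a rough illustration of how cosmetic the difference is on the client side, this Python sketch rewrites a `tcp://IP:PORT` endpoint into the `https://IP:PORT` URL that an HTTP library or proxy would expect. The `endpoint_to_https` name and the sample address are assumptions, and whether a given daemon actually serves TLS on that port depends on how it is configured.

```python
from urllib.parse import urlsplit


def endpoint_to_https(endpoint):
    """Rewrite 'tcp://HOST:PORT' as 'https://HOST:PORT' (hypothetical helper)."""
    parts = urlsplit(endpoint)
    # Require an explicit host and port so nothing is guessed.
    if parts.scheme != "tcp" or not parts.hostname or parts.port is None:
        raise ValueError(f"expected tcp://HOST:PORT, got {endpoint!r}")
    return f"https://{parts.netloc}"


url = endpoint_to_https("tcp://10.0.0.5:2376")
```

Only the scheme changes; the host and port pass through untouched, which is why the flag format alone does not affect proxying or auth.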

BTW, it's interesting that you chose to reply to "refrain from adding new features" part (which may have been unnecessary snark on my side), rather than to "stopping for a while to think" part (which is the actual point)…
It seemed like the same question to me.

On "stopping for a while to think", I suppose it's subjective. From my point of view we spend insane amounts of time thinking about the right design for all these things. At the same time we receive approximately 50 pull requests a week, and a lot of them are for small incremental improvements which are very reasonable and could be merged very easily with little effort. So, there is a balance between "feature all the things!" and "sorry we're not merging anything for 3 months while we rethink everything from scratch". Every large scale open-source project deals with that tension. We definitely have room for improvement. But we care very much about good, minimal design, and not breaking APIs.

Documentation is not the primary problem with Docker that Rocket is trying to solve. The issue is Docker's process model, which is fundamental to the way it is built. Docker seems to have taken the position that the best way to solve the problems with its process model is not to fix it, but to add more features to the kernel to force it to work. Rocket went with the fix-it approach and simply started over.

Also, would you mind clarifying what you mean by "daemonless mode"?

By daemonless I mean making "docker run" be able to directly fork a container without needing dockerd. Like how "rkt run" works.
What's interesting is that Docker used to run (almost) this way some time ago. I guess the daemonless mode won't fly with managing communication (iptables NAT, IP assignment, linked containers, etc).
Yeah we used to call it "standalone mode", it was pretty neat but really confused users (it would auto-detect whether to go daemon mode or not, so behavior became contextual and hard to troubleshoot).

Yes, managing communication or other centralized resources is harder that way which is a challenge for "embedded Docker". Rocket does not have this problem because it relies on systemd to manage all this centrally under the hood. So you get "daemonless" as long as you sweep everything under the giant systemd rug :)

It doesn't need to be systemd, and that's the beauty of Rocket's design: it is composed of individual, well-defined pieces with precise boundaries and areas of responsibility. As I wrote, my main practical concern now is to get something container-like working on FreeBSD. Docker is useless here unless it's wholly ported; Rocket is useful even if I don't use its code and go with an (even partial) implementation of the spec.

From my practical POV, my options now are: port Docker (NOPE), reimplement Docker (NOPE GOD WHY), port Rocket (Maybe?), reuse the spec and pieces of Rocket's code in my own opinionated NIH plumbing (Hell yeah, somebody did the thinking part for me! The spec is usable!), or write my own opinionated NIH plumbing from scratch (why would I if there's a decent spec to lean on?).

This is truly why Rocket is so interesting. The App Container Spec is what is missing from Docker. They had their chance to implement a common universal spec which others could implement independently, but didn't do so. Now CoreOS is picking up the ball and running. I don't see any option for Docker other than to work with CoreOS to form this universal specification, or to adhere to the specification after it's been finalized.
Often people want to run without dockerd because they want to use a different orchestration system (systemd, OpenStack, etc.) and usually that system sets up the networking and such.
Yes, what I call "embeddable Docker" is a desirable thing. It will help Docker integrate with other centralized daemons that want to own the process tree, like systemd. I've made it very clear in the past that I want this too, but would welcome help to implement it. I'm actually putting together a "hacking sprint" so that the various interested parties can participate in making it happen. I actually invited the CoreOS guys to join the effort last week - but they seem determined to do it themselves instead. That's OK but they should know they are welcome back if they change their minds.

So, anybody want to help hack on embedded Docker?

See also this good, unbiased explanation of the relationship between Docker, CoreOS and systemd: http://www.ibuildthecloud.com/blog/2014/12/03/is-docker-fund...

The next systemd hackfest is on Jan 30th:

  https://plus.google.com/events/c56kbn26s6g01n6m4tj2nmdgnfc
If someone could fix docker so that projects like this are not required:

  https://github.com/ibuildthecloud/systemd-docker
...that would be great!
LOL. Doxorcism is pure genius. :)