by zaargy 3976 days ago
TL;DR It's too damn complicated if you're not Google/Twitter/Netflix. Most people would be fine just deploying OS packages and keeping their stacks as simple as possible.

I still love Docker and do think it solves a genuine problem. But yes, where to put your logs, how to manage state, how to schedule containers on machines, how to coordinate processes, how to inspect an app when something goes wrong, how to measure performance, how to manage security, how to keep consistency across your docker containers... are all problems you need to solve from the get go with Docker and they are all non-trivial! Ain't nobody got time for that.
I'm not sure how much you know about docker, so to anyone whom this list scares:

> Where to put logs

Well, I just throw them aside and use `docker logs [container]`

> How to manage state

One container should perform one service. I haven't run into a problem here.

> How to schedule containers

ECS :) But honestly, I subscribe to the approach that containers = services and thus should just always be running.

> How to inspect app

`docker exec -it [ container id ] bash` ("ssh" into container)

`docker logs`

`docker logs -f [container]` (follow logs)

> How to measure performance

Probably same way you measure system performance

> How to manage security

Everything of mine is in a VPN; some services can talk to certain services over certain ports... Personally, I don't really understand all this talk about security. Protect your systems and that should protect your containers. Why is it that isolated processes are causing people to throw up their arms like security is unimaginable in such a world? There are ways.

> Consistency across docker containers

This can be a pain if you need this, yea. They seem to be adding better and better support to allow containers to talk to one another (and ONLY to one another).

> Ain't nobody got time for that.

Hmm, personally I don't have time to go thru what Puppet, Chef, and even Ansible require to get your systems coordinated. I see this as far more work than creating a system specification within a file and finding a way to run it on some system.

All comes down to requirements though and where your technical stack currently is at. To any newcomers who are also plowing into the uncertain fields of a dockerized stack, fear not! You are in good company and if I can make it work, you can too.

If this is your advice then you shouldn't give advice.

1) 'docker logs' relies on using the json logdriver which means the log file is stored in /var/lib/docker/..... and grows forever. No rollover. No trimming. FOREVER.

2) What if your container dies? What if your host dies? Do you have any state at all or have you abstracted that out? Are your systems distributed?

3) Always running does not answer finding where to run them

4) That only works if the container is running. What if it died? Also, docker logs is a fool's game

5) bingo, that's right at least

6) ....

1 - Yes it does. This is a system problem more than a docker problem. For any relatively experienced engineer, they should be capable of realizing the logs must be stored somewhere and can plan around it.

2 - If a system dies and it has a state, then what do you do? If a dockerized process dies, and it has a state, then what do you do? This isn't some new problem to Docker. If my database service dies, you know what happens? It starts back up and connects to the persistent volume. Personally speaking, yes all of my services / systems are distributed.

3 - Most people don't need to start their services exactly at this point and then stop at another certain point (which is why I pretty much brushed over it). If they do, there's plenty of tools to do this that can also utilize docker.

4 - What if a system died? Does this mean you SSH'ing in isn't a viable option? (yes...)

5 - Yes, you love negativity so clearly this is your favorite

6 - ...? What? Do you have something more to say?

It's cute that you like to poke holes and personally attack people, but really my comment was just how I go about things on a day-to-day basis. This is coming from someone who has 6 major Docker services abstracted out running all the time across 3 environments.. all capable of being updated via a `git push`. I think I have decent, practical advice to offer for other docker-minded practitioners and just decent advice to newcomers.

Your grievances circle around logs not being centralized, easily-accessible (1, 2, 4).. You also don't outline any solutions yourself.

> For any relatively experienced engineer, they should be capable of realizing the logs must be stored somewhere and can plan around it.

Yes, the unrotated container logs are kept in a root-accessible-only location in a directory named after a long key that changes on every image restart - not conducive to manual log inspection, and definitely not conducive to centralised logging. That's not a 'system problem', it's Docker just being rude. Yes, a relatively experienced engineer can work around that... but why should they need to 'work around' it in the first place?

Ironic really, that if you put a user in the 'docker' group, that they can do anything they want with the docker process, destroying as much data as they like or spinning up containers like nobody's business... but they can't see the container logfiles.

Good point, thank you!
> 1) 'docker logs' relies on using the json logdriver which means the log file is stored in /var/lib/docker/..... and grows forever. No rollover. No trimming. FOREVER.

Even without that issue, I'd prefer my logs to be centralised. So as well as my app should I be running a logging daemon, process monitoring, etc for each docker instance?

What we do at work is that we have our containers be in charge of talking 'out' on a given address and format for logs, and have things configured so that entire sets of machines end up speaking with the same log server (an ELK stack, in our case). The process monitoring is done per host: There are docker-aware tools that look at the host, and can peer into the container, to do this basic tracking.
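A minimal sketch of the "talk out" approach described above, assuming the application is written in Python and the log server accepts syslog over UDP. The `LOG_HOST` and `LOG_PORT` environment variable names and the `myapp` logger name are hypothetical, and an ELK setup would need a matching syslog input on the receiving side.

```python
import logging
import logging.handlers
import os


def make_remote_logger(name, host, port):
    """Return a logger that ships records to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of any local root handlers
    # Given a (host, port) tuple, SysLogHandler sends UDP datagrams by default.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical variable names: every container on a set of hosts would be
    # started with the same values, so they all report to the same server.
    log = make_remote_logger(
        "myapp",
        os.environ.get("LOG_HOST", "127.0.0.1"),
        int(os.environ.get("LOG_PORT", "514")),
    )
    log.info("service started")
```

Because the address comes from the environment, the image stays the same across environments and only the container's configuration changes.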

People are not kidding though, when they say that everything gets very complicated. All the things that we did by convention and manual configuration in regular VMs that are babysat manually have to be codified and automated.

Docker is going to be a great ecosystem in 3 years, when the entire ecosystem matures. Today, it's the wild west, and you should only venture forth if having a big team of programmers and system administrators dedicated just to work on automation doesn't seem like a big deal.

Similar to hibikir's reply, what we do is attach a volume container to all app containers, and logs are written to that. Then we run an ELK stack to view and parse the logs. For process monitoring we run cAdvisor on each host to view the resource usage of each container. Since your apps are containerized, it is easy to monitor them for resource usage, hook them up to Nagios, etc. We have built a custom GUI to do all this.
> If this is your advice then you shouldn't give advice.

Stop with the blaming statements.

Do you have a better way to say "your advice is bad. stop spreading misinformation"?
"There are some problems with your approach, and it could bite you in the tail when you least expect it. Here's what you should also know: ... (rest of GP's points)"

This avoids saying "You're an idiot", which is nearly never constructive or helpful, and instead makes education and cooperation its goal. Most people respond better to that.

There's no need to say that at all. Just address the points and let everyone form their own opinions.
Yeah, I do. Simply say what you find inaccurate about what they said and let everyone else come to their own conclusions. Saying "stop spreading misinformation" is basically speaking for others. Maybe we want to hear the misinformation because it's interesting or maybe you are wrong about it being informational or not. Either way, it's my choice to listen to him/her or not.
Well I used Docker on top of Mesos so I have quite a bit of experience and the above were all problems I faced. They're not impossible problems obviously but they take time and thought to solve. From your responses, I am not sure you fully appreciate the problem to be honest.

You have to remember that your containers are coming and going all the time, which is one of the biggest challenges. It basically means you have to have everything centralised and that means a lot of additional infrastructure/complexity.

Wouldn't it be fair to say these are all problems faced with any / all deployments? When are logs never going to be a problem or security or state? I appreciate the problems (honestly!), but when it's presented as docker-specific, I get confused. Yes.. all these things need to be managed.. at the end of the day you're changing your stack from running 5 systems to running 5 processes acting like 5 systems. This is going to take some thought, but I genuinely believe there's a greater reward at the end of the tunnel in this realm than there is the old world of puppet master / slave.
The problem(s) with "docker logs" is that, without getting logs out of docker you can't see multiple containers' logs interleaved, and without a separate logrotate setup they're not rotated (the files in /var/lib/docker/containers/ grow indefinitely).
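To illustrate the interleaving half of that problem, here is a rough sketch that merges several containers' logs by timestamp once the files have been read off the host. It assumes the json-file driver's one-JSON-object-per-line layout with `log`, `stream`, and `time` keys and UTC timestamps ending in `Z`; the container names and sample lines are made up.

```python
import heapq
import json
from datetime import datetime


def parse_time(stamp):
    """Turn an RFC 3339 UTC timestamp into a sortable (datetime, nanoseconds) pair."""
    base, _, frac = stamp.rstrip("Z").partition(".")
    seconds = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    # Fractions may have fewer than nine digits, so pad before converting.
    nanos = int(frac.ljust(9, "0")[:9]) if frac else 0
    return seconds, nanos


def read_entries(name, lines):
    """Yield (timestamp, container name, message) for each log line."""
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        yield parse_time(record["time"]), name, record["log"].rstrip("\n")


def interleave(logs):
    """Merge per-container logs (each already in time order) into one list."""
    streams = [read_entries(name, lines) for name, lines in logs.items()]
    return list(heapq.merge(*streams))


if __name__ == "__main__":
    # Made-up sample lines standing in for two containers' log files.
    web = ['{"log":"GET /\\n","stream":"stdout","time":"2015-08-01T10:00:00.5Z"}']
    db = ['{"log":"ready\\n","stream":"stderr","time":"2015-08-01T10:00:00.25Z"}']
    for (when, nanos), name, message in interleave({"web": web, "db": db}):
        print(when.isoformat(), nanos, name, message)
```

This only helps with after-the-fact inspection on one host. It does nothing about rotation and is no substitute for shipping logs to a central place.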
If this is what it takes to get procrastinators onto a real logging stack I'm not sure I see it as a problem.
As far as logs go, you should just be doing network logging with Docker to either a syslog server or something like LogStash or Splunk. If you are big enough to have a genuine need for containers, you also are big enough for a centralized logging system.
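As a sketch of what network logging looks like at the `docker run` level, the helper below assembles the logging flags for a container. It assumes a Docker version that supports `--log-driver` and `--log-opt`, including the syslog driver's `syslog-address` option and the json-file driver's `max-size` and `max-file` options. The image name and log host are hypothetical.

```python
def logging_flags(driver, **options):
    """Build `docker run` logging flags for a driver and its options."""
    flags = ["--log-driver", driver]
    for key, value in sorted(options.items()):
        # Python keyword names use underscores; the option names use hyphens.
        flags += ["--log-opt", "{}={}".format(key.replace("_", "-"), value)]
    return flags


def docker_run(image, log_flags):
    """Return the full argument list for a detached container."""
    return ["docker", "run", "-d"] + log_flags + [image]


# Ship container output to a central syslog server (hypothetical host).
remote = docker_run(
    "myorg/myapp",
    logging_flags("syslog", syslog_address="udp://logs.example.internal:514"),
)

# Or keep local json-file logs but cap their size and count.
local = docker_run(
    "myorg/myapp",
    logging_flags("json-file", max_size="10m", max_file=3),
)

if __name__ == "__main__":
    print(" ".join(remote))
    print(" ".join(local))
```

Either list can be passed to `subprocess.run` as-is. The point is that the logging destination is decided when the container starts, not inside the image.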

At the end of the day, you have to view it as building a reliable system that performs a function. Docker is one tool you can use to do that. Virtual machines are another tool. They don't solve all the problems you describe, nor are they intended to. If you're a tiny startup, you can just go the AWS route, but that leaves you beholden to AWS and their pricing. That's fine early on, but eventually you'll want to go full-stack for one reason or another.

There's a few projects out now that do most of this for you; there's a lot of rapid innovation in higher-level docker tools, e.g. https://github.com/remind101/empire (built on top of EC2/ECS). You get a 12-factor-compatible PaaS out of it, pretty easily.
The funny part is that Docker was supposed to be higher level.
It is? It's one more step up the chain towards the ultimate goal: being able to run M isolated instances of N different apps automatically distributed across Y physical hosts (and being able to deploy app A without caring about any of this). We're almost there.
I have a feeling this is not just a problem with Docker. People tend to choose technologies not because they solve their problem, but because it's hip to be using the newest stuff, even if it's far too big and complicated for their simple usecase.
In this regard I think the remarks in McKinley's "Choose Boring Technology" [1,2] are quite relevant.

[1]: http://mcfunley.com/choose-boring-technology-slides [2]: http://mcfunley.com/choose-boring-technology

Big thanks for these links. Actually this is really true for Docker and DevOps. There are proven concepts and known unknowns, but for Docker the unknown-unknown part is really scary, especially regarding security in production. Maybe for bigger companies this is no problem, but for small dev teams this is very risky and time-consuming.

Just one non-trivial example: I can secure Ubuntu against sshd attacks pretty well and easily with `sudo apt-get install fail2ban`. Now try to secure CoreOS against sshd attacks. There are guys out there who tried to run fail2ban in a container (without luck), and so far I've only found one hacky script which tries to do the same oO https://github.com/ianblenke/coreos-vagrant-kitchen-sink/blo...

CoreOS is not docker. You can run docker on ubuntu and install fail2ban on it. I don't see the problem here.
Yes, exactly. I really can't recommend that my company use this right now because of the complexity. The Hello World image is easy, but after that, there's quite a bit involved. And my company is already happy enough with spinning up VMs.
That's what you got from the article? I got the opposite. It works fine until you start to have large, complex images where build time becomes a factor and the fundamental design of Docker starts to get in the way (e.g. how it manages diffing of images, the lack of caching, and the inability to build different parts of the image in parallel). These shouldn't be as big of a deal at smaller scale.

That's not to say you're wrong; containers probably aren't that useful to most small shops. But that summary doesn't make any sense for this article.

A number of issues discussed in the article would be factors regardless of scale: logging, secrets, edgy kernel features, security.

Also, see https://titanous.com/posts/docker-insecurity

Same here. AMI with direct-from-git updates when they start. Docker as part of the build and deploy process of upgrades is gonna be too costly anyway, because setting up the first configuration takes time, moving the containers takes time, etc.
More accurate TL;DR: Docker and containerized architectures generally would be improved by solving this list of problems.
Is it too complicated because people already have complicated setups, or just in general?