Hacker News
by Animats 4233 days ago
"All problems in computer science can be solved by another level of indirection" - David Wheeler

That's what "containers" are, of course. There's so much state in OS file namespaces that running any complex program requires "installation" first. That's such a mess that virtual machines were created to allow a custom OS environment for a program. Then that turned into a mess, with, for example, a large number of canned AWS instances to choose from. So now we have another level of indirection, "containers".

Next I expect we'll have container logistics management startups. These will store your container in a cloud-based "warehouse", and will continuously take bids for container execution resources. Containers will be automatically moved around from Amazon to Google to Rackspace, etc. depending on who's offering the lowest bid right now.
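A minimal sketch of what such a bid-following scheduler might decide, assuming hypothetical hourly prices and a one-off migration cost (the `Bid` type, the prices, and the `should_move` rule are all illustrative assumptions, not anything an actual provider offers):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    provider: str
    price_per_hour: float  # hypothetical price per container-hour


def cheapest(bids):
    """Return the bid with the lowest hourly price."""
    return min(bids, key=lambda b: b.price_per_hour)


def should_move(current, best, migration_cost, hours_remaining):
    """Move only if the savings over the remaining run exceed the migration cost."""
    savings = (current.price_per_hour - best.price_per_hour) * hours_remaining
    return savings > migration_cost


# Made-up numbers, purely for illustration.
bids = [Bid("Amazon", 0.050), Bid("Google", 0.045), Bid("Rackspace", 0.060)]
current = bids[0]
best = cheapest(bids)
print(best.provider, should_move(current, best, migration_cost=0.01, hours_remaining=10))
```

The migration cost term is there because always chasing the lowest bid ignores the price of moving the container and its state.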

6 comments

It's more like a ping-pong. Things start off simply, but over time as the layers of abstraction pile up, things become brittle and unworkable.

I view containers as more of a reworking of a key computational abstraction (VMs) than an evolution of them. We finally have operating systems with enough inter-process isolation, sufficiently capable filesystems (layering), etc. that we can throw out 80% of the other unnecessary junk of VMs like second kernels, duplicate schedulers, endless duplication of standard system libraries, etc.

So it's more like we've hacked/refactored virtualization into a more usable state, and gotten rid of a lot of useless garbage that it turns out we didn't actually need. It's a lot like how a big software system evolves, now that I think about it.

I'm genuinely curious, although a bit naive WRT containers. Outside of an aesthetic preference (for being able to remove 80% unnecessary cruft), what is the advantage of containers? I was under the impression that VM overhead was marginal in terms of today's computing.

I ask because I'm familiar with VMs, having worked with them extensively for a number of years. VMs work quite well for any application I've needed, so what would be the benefit of switching to containers? I've got lots to do, and lots to learn, but I can't see learning containers (and being out of sync with the rest of my coworkers) being a priority.

But I'm willing to change my mind if there's a concrete benefit. Right now, VMs work just fine, but maybe there's something I'm missing...

VM overhead isn't trivial. It still remains a pretty big factor in terms of cost bloat for CPU-bound stuff. Also, VMs take a godawful long time to start up; if you care about, say, responding to load within ten seconds, VMs aren't a great choice.

They're fine for a lot of things, of course. I use them all the time. But I use containers for other things.

I recall several reliable testers confirming that the CPU overhead of virtualisation was negligible, somewhere around 2%. Unfortunately I could not quickly find those papers now, but I did find an old VMware whitepaper[1] showing they had ~7% overhead 5+ years ago, which sounds about right considering what kind of advancements they would have made in half a decade.

[1] http://www.vmware.com/pdf/hypervisor_performance.pdf

Sounds feasible, but CPU usage isn't really talked about as an advantage of containers.

I expect startup time and memory usage would be lower, but to my mind the advantages are mainly around flexibility, e.g. how long it takes to create or upload an image file, how long it takes to set up a minimal infrastructure with several components on a single EC2 instance, and decoupling the operating system patch cycle from the app deployment image generation cycle.

It's just MUCH more memory efficient to run containers, and VMs typically have worse I/O throughput. CPU scores are fine though.

As an example, I am running around 20 containerized servers on my laptop in a 4GB VM, which would typically be run on 20 distinct VMs on one or more hypervisors. It's not very fast, but the density of servers you can put on your hardware is MUCH bigger.
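To put rough numbers on that density, here is a back-of-the-envelope sketch. The 20 servers and the 4GB VM come from the comment above; the 1GB-per-VM figure is purely an assumption for comparison.

```python
def per_server_mb(total_mb, servers):
    """Average memory available to each server when they share one pool."""
    return total_mb / servers


containers_total_mb = 4 * 1024  # the 4GB VM hosting all the containers
servers = 20
vm_each_mb = 1024  # assumption: 1GB reserved for each dedicated VM

# Average memory budget per containerized server.
print(per_server_mb(containers_total_mb, servers))

# How many times more RAM the 20-VM layout would reserve under that assumption.
print(servers * vm_each_mb / containers_total_mb)
```

With these figures each containerized server averages about 205MB, and the one-VM-per-server layout would reserve five times as much RAM.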

"if you care about, say, responding to load within ten seconds, VMs aren't a great choice."

That's actually exactly why I would use a VM..

You can spin up, configure, and push into production an application in a new virtual machine in ten seconds? I'd like to see proof of that.

The best I've managed was ninety seconds on my own hardware and three minutes (on average) in AWS.

Ah sorry! I didn't think you meant literally "10 seconds", was assuming you just meant quickly (a few minutes).

I can't really think of a use case though where someone would need more capacity in under 10 seconds. Maybe if you only intend to scale horizontally with a bunch of 500MB instances and had little to no room to set an appropriate scaling threshold? What would be a couple examples? With the apps I've seen over the past several years, they generally have scaling thresholds at 'X' resource, and 3 minutes is more than enough to provision extra capacity for their needs.
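One way to frame the threshold question: the longer provisioning takes, the earlier you have to trigger a scale-out, so the more headroom sits idle. A rough sketch, assuming load grows linearly (the capacity and growth rate are made-up numbers; the 10 second and 3 minute lead times are the ones discussed in this thread):

```python
def scale_out_threshold(capacity, growth_per_second, provision_seconds):
    """Highest load at which scale-out can start and still finish before
    capacity is reached, assuming load grows linearly."""
    return capacity - growth_per_second * provision_seconds


capacity = 1000.0  # hypothetical requests/s the current fleet can serve
growth = 2.0       # hypothetical load growth, in requests/s per second

for label, seconds in [("10 s lead time", 10), ("180 s lead time", 180)]:
    threshold = scale_out_threshold(capacity, growth, seconds)
    print(f"{label}: trigger at {threshold:.0f} req/s ({threshold / capacity:.0%} of capacity)")
```

Under these assumptions a 10 second lead time lets you run to 98% of capacity before scaling, while a 3 minute lead time means triggering at 64%. Whether that gap matters depends entirely on how spiky the load is.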

Also, kind of ironic but your site is giving me a 503 :p

Just lighter weight. Ways that containers can be cool:

You need to start 10 containers locally to dev your app. Spinning up a local 10 VMs kind of sucks. 10 containers can be pretty quick.

(Now, whether you really need 10 containers is another question, and some people clearly over-split their architecture.)

Containers are just a way to launch threads without polluting your local namespace or system. It's a way to say "hey, this stuff shouldn't interfere with anything else".

Well, we've had various containers, such as BSD jails, for decades. The useless garbage wasn't necessary. Seems like ping pong happens whenever "kids these days" don't know why the status is quo then have to relearn the old lessons.

IMO, the problem is that your standard OS has way too much stuff running.

A SaaS app running in production should be about the size of your binary, and the libraries it uses. Instead, we have X, SMTP, terminals and a full filesystem running. Home directories and UIDs make no sense in an app that uses no Unix users except for the one you're forced to use.

I'd really like to see a much smaller, simpler, non-posix OS for running server apps.

I haven't had a chance to play with it, but I ran across this project the other day: https://github.com/cloudius-systems/osv

"OSv was designed from the ground up to execute a single application on top of a hypervisor... OSv... runs unmodified Linux applications (most of Linux's ABI is supported) and in particular can run an unmodified JVM, and applications built on top of one."

This article [1] lists a couple of other "cloud OS" systems. OSv and mirage [2] seem to be the two most promising ones right now.

1: http://www.linux.com/news/enterprise/cloud-computing/751156-...

2: http://www.openmirage.org/

"I'd really like to see a much smaller, simpler, non-POSIX OS for running server apps."

The POSIX system interfaces (read, write, open, close, etc.) are OK. It's the Commands and Utilities that are the problem. Do you really need Bash available? How much of the 50,000,000 lines of Linux need to be inside your VM running your one web application? How much attack surface is provided by the presence of all that stuff?

There's a project which has taken the C runtime library and made it run on a bare VM, so you don't need an OS instance at all. If you're just running one program, that makes a lot of sense.

This doesn't really pencil out... "your binary, and the libraries it uses" can easily get into the GB when you include components like the .NET framework or the Java base class library. I don't know exactly how large a fully-loaded npm repo with a warm cache or a warmed-up rvm installation directory is, but it isn't tiny.

Second, POSIX is a standard for how the operating system API works that has nothing to do with what packages are installed -- and it's a pretty low-level API, for doing stuff like read, write, fork, exec, etc. This isn't what's adding bloat.

In this case, the problem isn't being solved -- solving the problem would mean moving away from dependencies on the global OS namespace by relearning how to write self-contained applications (some people never forgot).

Containers are just a big wad of duct tape holding together the ball of mud that comprises most web applications' server-side components.

Add containers, and you haven't solved the problem, you've just made two problems.

It's great for those nasty legacy apps that only work on old unmaintained versions of Rails, old OS versions, etc.

Take all the nastiness and throw it into a box, without needing to contact Ops to reserve memory and provision a VM.

IMO, it's one of the major reasons why enterprises get so excited about Docker. Legacy app dependency issues are horrible once you get past a certain scale.

VMs are expensive and non-self-service at most orgs since they tie up RAM and licenses.

Functions are just a big wad of duct tape holding together the ball of mud that comprises most applications' lines of code. Add functions, and you haven't solved the problem, you've just made two problems.

Does this sound true to you? What makes containers any different from the organization that the abstraction of "functions" brings to ordinary sequential programs?

Containers abstract over reducible complexity.

Functions (should) abstract over irreducible complexity.

As long as we're asking hypotheticals: why do applications need to control the global OS namespace and the dependencies between elements in that namespace to a degree that the applications themselves can't be easily deployed without containers?

Can that problem be reduced?

If not, why not?

It's not really adding another level of indirection, it's taking one away. The pain of change remains in that you have to internalize yet another new layer, BUT at least this way you get to leave VMs behind. It's trading one layer for another slightly more granular one instead of piling another one on top.

It's amazing to me how much easier the process is for this nowadays. We really live in an exciting time.

Bracket Computing is basically doing this. Funded by a16z too. https://www.brkt.com/