by jacquesm 4390 days ago
> Personally, I think we are on the cusp of a transition from VPS (xen/hvm) to VPS (containers).

I'm not so sure of that. I think a lot of the use-cases for VMs are based on isolation between users and making sure everybody gets a fair slice. Something like docker would work well with a single tenant but for multi-tenant usage docker would give you all the headaches of a shared host and very little of the benefits of a VM. For those use cases you're probably going to see multiple docker instances for a single tenant riding on top of a VM.

The likes of Heroku, AWS, Google etc will likely use docker or something very much like it as a basic unit to talk to their customers, but underneath it they'll (hopefully) be walling off tenants with VMs first. VMs don't have to play friendly with each other, docker containers likely will have to behave nicely if they're not to monopolize the underlying machine.


I want option 3. A 4U rack with 32 completely isolated embedded stand alone quad core ARM or PPC systems, a network switch and an FPGA on each connected to the switch fabric.

Then we can start doing some interesting stuff past finding new ways to chop computers up.

Very interesting, that would be something I'd buy just to mess around with. I can think of a few ways in which I'd use it right off the bat, and if you give me a couple of hours more I'll have a whole raft of them :)
That does not sound very high density compared to what you can get from a company like Baserock - http://www.baserock.com/servers
I want a hefty FPGA attached to the CPU bus and switch backplane. That will take a lot more power than the ARM core.
32 ARM chips in 4U seems very low to me, just in terms of the TDP a 4U rack is able to dissipate at present. You could increase density a lot.
You could but I want standard storage per node (PCI-E FLASH), redundant PSUs and the TDP of a hefty FPGA going flat out is a lot larger than that of the ARM core.
> I think a lot of the use-cases for VMs are based on isolation between users and making sure everybody gets a fair slice.

Containers do this.

Hm. We'll see about that. I can see a whole pile of potential issues here with 'breaking out of the docker' on par with escaping from the sandbox and breaking the chroot jail, which I see this as a luxury version of.

Of course you could try to escalate from a VM to the host (see cloudburst) but that's a rarity.

Docker seems to be less well protected against that sort of thing, but I'm nowhere near qualified to make that evaluation so I'll stick to 'seems' for now. It looks like the jump is a smaller one than from a VM.

Fair usage of resources and security isolation are two VERY different problems. Containers can be VERY good at resource isolation. Security has not really been figured out yet.
This isn't really a "we'll see" issue. It is a fact that containers do resource isolation. :P The security issues are orthogonal.
Containers don't isolate very well. One thing that is easy to do is to make the system do disk output on your behalf just by making lots of dirty pages, or make the system use lots of memory on your behalf due to network activity. And of course there are the usual problems that you already have with VMs such as poor cache occupancy.

Shared hosting of random antagonistic processes is something that many developers are not quite ready to embrace. If you are willing to run your service with poor isolation and questionable security then containers are just the thing. You'll definitely spend less money if you can serve in such an environment.

I beg to differ. If you manage to break out of a container then all the resources of the machine are at your disposal.

So they're orthogonal only as long as the security assumptions hold.

I don't know where this myth came from that you NEED VMs for fair slicing. Linux (like most other OS kernels) has been doing fair slicing just fine for years. I think the disadvantages of containerization are similar to those of OpenVZ VPSes: you can't partition your hard disk and you can't add swap space.
It's not a myth. A VM is effectively a slice of your computer that you can pre-allocate in such a way that the VM cannot exceed its boundaries (in theory, of course, this works perfectly; in practice, not always).

So all other things being equal, if you slice up your machine into 5 equally apportioned segments and you run a user process in one of those 5 slices that tries to hog the whole machine, it will only manage to create 1/5th of the load that it would be able to create if it were running directly on the host OS.

So yes, Linux does 'fair slicing' if you can live with the fact that a single process will determine what is fair and what is not. The fact that that process gets pre-empted and that other processes get to run as well does not mean the machine is not now 100% loaded.
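To make the distinction concrete, here is a toy sketch in Python. It is only a model of the argument above, not of how the Linux scheduler or any hypervisor actually works, and the function names `hard_capped` and `fair_share` are made up for illustration. Demands and capacity are fractions of one machine.

```python
def hard_capped(demands, capacity):
    """Pre-allocated slices: each tenant gets at most capacity / n, even if idle capacity exists."""
    cap = capacity / len(demands)
    return [min(d, cap) for d in demands]


def fair_share(demands, capacity):
    """Work-conserving max-min fairness: capacity nobody else wants goes to whoever asks for it."""
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    active = [i for i, d in enumerate(demands) if d > 0]
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # Tenants whose outstanding demand fits inside an equal share are fully served.
        satisfied = [i for i in active if demands[i] - alloc[i] <= share]
        if not satisfied:
            # Everyone wants more than an equal share: split what is left evenly and stop.
            for i in active:
                alloc[i] += share
            remaining = 0.0
            break
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = demands[i]
        active = [i for i in active if i not in satisfied]
    return alloc
```

With one hog and four idle tenants, `hard_capped([1.0, 0, 0, 0, 0], 1.0)` gives the hog a fifth of the machine, while `fair_share` on the same input hands it the whole machine. When all five tenants are busy, both models give each of them a fifth.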

Using quota for disk space, 'nice', per-process limits for memory, chroot jails for isolation and so on, you can achieve much the same effect, but a VM is so much easier to allocate. It does have significant overhead and of course it has (what doesn't) its own set of issues, but resource allocation is actually one of the stronger points of VMs.
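As a rough illustration of the 'nice' plus per-process memory limit approach, here is a minimal Python sketch using the standard `os`, `resource` and `subprocess` modules. It assumes a Linux host, covers only CPU priority and address space (no disk quota, no chroot), and the helper names `make_limiter` and `run_limited` and the 512 MiB default are made up for the example.

```python
import os
import resource
import subprocess
import sys


def make_limiter(max_address_space_bytes, niceness):
    """Build a function that applies the limits inside the child, just before exec."""
    def apply_limits():
        # Lower the child's CPU scheduling priority (the same thing 'nice' does).
        os.nice(niceness)
        # Cap the child's total virtual address space; larger allocations will fail.
        resource.setrlimit(
            resource.RLIMIT_AS,
            (max_address_space_bytes, max_address_space_bytes),
        )
    return apply_limits


def run_limited(code, max_address_space_bytes=512 * 1024 * 1024, niceness=10):
    """Run a Python snippet in a child process under the limits above."""
    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=make_limiter(max_address_space_bytes, niceness),
        capture_output=True,
        text=True,
    )
```

A child started with `run_limited("x = bytearray(2 * 1024**3)")` should fail its 2 GiB allocation instead of eating the host's memory. Note that these limits are per process, so a tenant that forks many children gets the limit many times over, which is part of why a VM is the easier unit to allocate.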

Well yes, but KVM is a VM that's just using Linux to do this VM scheduling. The main issue is that the API for containers is less well defined (IO scheduling is not necessarily fully fair with VMs, but it's mainly aio on the host side at least).
You can add swap space, and there is even swap space accounting support in the kernel. Personally I don't use swap, I just buy fat amounts of RAM and allocate them to diskless worker nodes in my clusters. As for partitioning, manual partitioning can give a slight speed advantage (if you know which filesystem you want to use, you have a long enough lived job to justify optimization, etc.), but generally you can just use http://zfsonlinux.org/ or at least LVM2 to avoid the segregation requirement entirely. In the former (ZFS) case you get arbitrary-depth COW snapshots, dynamic reallocation, transparent compression, and other useful options for ~free as well. In the latter (LVM2 LV) case you get single-depth snapshots (though in theory this is improving, e.g. via thin provisioning) but no dynamic resizing support (AFAIK, unless you use nonstandard filesystems).