Hacker News new | ask | show | jobs
by jsmeaton 3390 days ago
Sure, that's a valid question to ask. But imagine you have 1,000,000 customers. Now you have to calculate and manage scaling groups for 1,000,000 customers * number of services. The resourcing costs alone would be outlandish, not to mention trying to independently scale each customer. Perhaps container systems would make this easier, but do they have better memory isolation? Is it possible for a container process to overrun into another containers memory without an exploit in the container system?
2 comments

A container is a process with some extra isolation (namespaces), they certainly can't overrun into each other without an exploit.

Why would the costs be outlandish? We offer that and we're fairly cheap. Since the cost is mostly fixed per customer, it should scale linearly.

As for scaling, they already have to do that, by pointing different requests at different servers depending on their load, etc.

Assuming for a minute that containers aren't in play, then the isolation model becomes that of a server/vm with the associated overhead of each. To make this easier we'll assume there's only a single service, even though we know this to be untrue.

If there are 1M customers that's a minimum of 1M servers. Some customers are obviously larger and would need more. There's also HA. Let's conservatively call it 2.5M servers.

At an absolute bare minimum we'd need to allocate 2.5M GB of ram and 2.5M vCPU. That's a huge amount of resources.

If you could reliably fit 10000 Small customers on a single server at 32gb ram and 8 cpus you can already start to see how many resources can be saved.

Without customer isolation you've got the entire cluster to handle load spikes and HA. With isolation you have to have scheduling monitoring each of the 1M clusters and scaling appropriately by anticipating demand.

Scaling a service is way easier than scaling customers within a service or many services.

Assuming for a minute that containers aren't in play, then the isolation model becomes that of a server/vm with the associated overhead of each.

Why? There's nothing magical about a container, it's literally just a cgroup of Linux processes. You don't have to use them to get the memory isolation we're talking about - uncontained processes get it too.

That's what we do: one process per client, uncontained, just running on a different system user.

But in any case, sure, use containers, I'm certainly not opposed to them.

That really isn't practical given the number of datacentres CF are in * the number of free customers they have.

Perhaps for some tier of paid customer.

There are many ways to ensure a system is fault-tolerant, scalable and still reasonably safe.

They certainly have the resources to solve this if only they want.

All I can say is that I am not willing to pay for something so fragile, but that is only my own opinion.

In the real world things may go differently: they exposed critical data back in 2012 too and they're still here ...

Linux namespaces/containers create a memory page table completely separate from the host's so barring vulnerabilities in the container implementation that allow mapping host physical memory to guest virtual, isolation is strictly enforced by the memory controller in the hardware. Without an exploit, the worst case scenario is leaking shared library read-only sections across containers (since the physical memory might be shared for a smaller container footprint, although i don't know if LXC supports that yet).
> Linux namespaces/containers create a memory page table completely separate from the host's

Each _process_ has its own memory page table. Containers are built out of processes, so they inherit this attribute.

Namespaces have nothing to do with it.

Sorry, I should have elaborated: with namespaces, each container instance gets its own process table with separate non-shareable pages (without KSM or other dedup feature) and then each container process gets its own page tables, like they normally do. The point is that there's an extra level of isolation beyond just processes, although there is still the kernel attack surface.