Mind you, it's hard to compare these as there's no real "cloud bench". For pure benchmark porn Nomad are the undisputed champs on their 1 million case.
The Cloud Foundry scaling test was intended to show a system with fully service-configured, fully-routed apps, with varying app characteristics (memory and RPS). To further stress the system, thousands of apps are crashing and being relaunched on a continuous basis.
Cloud Foundry installations with >10k containers have been ordinary for a while now; the 250k thing was to ensure we had lots of headroom and shake out chokepoints in Diego.
Big respect for your achievements. I guess at some point it just becomes the question of "where do I get a 1000 nodes" vs. "how do I run a 1000 containers". Or, more to the point, how to justify that amount of hardware. I mean, the one dream job which I would probably want is getting paid to cut out all the hardware use while keeping reliability/availability/functionality. Like these guys who cut their AWS bill by $1mil/year in about 3 months - https://segment.com/blog/the-million-dollar-eng-problem/. The thing is that I'm not exactly sure where I'd fit in more - running this thing, or just fixing it for somebody else. I definitely know that I'm mostly dealing with pets and not cattle :)
Well, Cloud Foundry is deployed by BOSH. So you can, if you wish, use RackHD to deploy it to naked hardware (instead of OpenStack, GCP, AWS, Azure and I forget what else).
Your apps will still be containerised, distributed and wired up the same way.
There's always a point at which it makes engineering sense to flip the switch to doing it yourself. But that frontier is never static. We (plus our peers in the Cloud Foundry Foundation) and others in this space like Red Hat OpenShift are constantly pushing back the tipping point at which it makes economic sense to DIY.
We already have very large customers with very large engineering teams, who've built platforms before. And they are switching because that effort no longer makes business sense. It's an expense they don't need for a platform they're the only maintainers of.
One of our peers at IBM wrote about DIY[0]. We have our own much more markety-businessy whitepaper, with a very detailed case, on the same topic[1].
I would use IPv6 for the orchestration network, probably not touch the TCP/IP parameters except for the port range (and the open file descriptor limit), and break up the broadcast domain into smaller networks. Putting thousands of machines on one broadcast domain is not advisable: it is a pain in the ass to troubleshoot, and a single network problem can affect every node across the entire gigantic network.
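To make the "smaller networks" idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefix is a documentation range and the /56 to /64 split is only an assumption for illustration, not a recommendation for any particular cluster.

```python
import ipaddress

# Hypothetical orchestration network (2001:db8::/32 is reserved for documentation).
ORCHESTRATION_NET = ipaddress.ip_network("2001:db8::/56")


def split_into_segments(network, new_prefix):
    """Return the smaller subnets that together cover `network`."""
    return list(network.subnets(new_prefix=new_prefix))


# One /64 per rack (or per group of hosts) instead of one flat network.
segments = split_into_segments(ORCHESTRATION_NET, 64)
print(len(segments))  # 256
print(segments[0])    # 2001:db8::/64
```

Each segment can then be routed separately, so a fault in one stays local to the hosts on that segment.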
Does anyone know how easy it is to set up autoscaling with Docker Swarm running on Google Cloud or AWS? We're looking to get started with Docker Swarm or Kubernetes soon, and are considering using Docker Swarm because of its simplicity and developer familiarity with Docker Compose (we use it for our dev environment). We just want to add nodes to a cluster as traffic spikes and remove them as it subsides.
Google Container Engine supports cluster autoscaling to automatically add nodes with load. It's listed as a beta feature though.
I've tried most of the Docker orchestration offerings and Container Engine seems by far the nicest. Swarm and Compose are really simple for getting up and running, but when we evaluated them there was still a missing piece: no neat way to do zero-downtime deployments.
There's a tool called Kompose to convert docker-compose config to Kubernetes manifests (https://github.com/kubernetes-incubator/kompose). It's nice for getting you started, but we tend to maintain the two separately now.
Docker for AWS will set up your swarm, which uses auto scaling groups for the worker nodes. You can then configure the auto scaling groups however you want, to scale based on your CloudWatch metrics, etc.
There is also a Docker for GCP product in beta (https://beta.docker.com), but I don't know how auto scaling works for it.
Disclaimer: I work at Docker on the Docker for AWS product.
I think most people would suggest that, if your use case is at a stage where that is important to you, Swarm is not the right thing. Kubernetes or ECS are better choices.
I wouldn't recommend ECS. I've used it for a little over 6 months and even for trivial things, it lacks. A couple examples that come to mind include not being able to pass host environment variables into your container instances easily, and not being able to specify that a service must run on all hosts.
There's an open issue (made ~2 years ago) on GH for the 1st example and it still hasn't been resolved.
In the context of Docker Swarm and Kubernetes, autoscaling refers to container-level scaling, i.e. given a set of nodes, any autoscaling function would manage the number of containers that are currently running on these nodes.
For instance/node level autoscaling (which is closer to what you need), I would recommend using the autoscaling features provided by AWS/Google Cloud.
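As a rough illustration of the container-level half, here is a sketch of the proportional rule that Kubernetes documents for its Horizontal Pod Autoscaler. The function name, the metric values and the min/max bounds are assumptions made up for the example.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to how far the metric is from its target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds (illustrative defaults).
    return max(min_replicas, min(max_replicas, raw))


# 4 containers averaging 90% CPU against a 60% target.
print(desired_replicas(4, current_metric=90.0, target_metric=60.0))  # 6
```

Note that this only decides how many containers to run on the nodes you already have; adding or removing the nodes themselves is the separate, instance-level problem.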
> I would recommend using the autoscaling features provided by AWS/Google Cloud
It would have to be integrated with Kubernetes though -- when we push a new docker container, the container would need to be updated on any new machines created. We'll look into GCP's autoscale solution.
The node-level autoscaling doesn't need to be integrated with Kubernetes. All it needs to do is create a new instance and register it as a node through normal channels.
Even if you don't need autoscaling, I'd suggest still using autoscaling groups and setting it to a fixed number of instances, so that instances will automatically get restarted if they go down.
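A minimal sketch of the "fixed number of instances" idea: pin the minimum, maximum and desired sizes to the same value. The keys are the parameter names accepted by boto3's `create_auto_scaling_group`; the group name, launch configuration name and subnet IDs are hypothetical placeholders.

```python
def fixed_size_group_params(name, launch_config, subnet_ids, size):
    """Build keyword arguments for a group pinned to exactly `size` instances."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        # Min == Max == Desired: the group never scales, it only replaces dead instances.
        "MinSize": size,
        "MaxSize": size,
        "DesiredCapacity": size,
        # The API expects a comma-separated string of subnet IDs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
    }


# All names below are made up for the example.
params = fixed_size_group_params(
    "swarm-workers", "swarm-worker-lc", ["subnet-aaa", "subnet-bbb"], size=3
)
print(params["MinSize"], params["MaxSize"], params["DesiredCapacity"])  # 3 3 3
```

With credentials configured, something like `boto3.client("autoscaling").create_auto_scaling_group(**params)` would submit it.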
Yeah, any new machine instance has to join the Swarm (or its equivalent in Kubernetes-speak). But that can be decoupled from Kubernetes or Docker swarm mode.
As for image management, it would depend on how you would like to propagate new images. With a private docker registry, you could potentially point each new instance to the registry and take care of propagating new images. I favor this approach since it keeps everything separate and easier to manage.
I always wonder: why not isolate on a process level, or even within a single, multi-threaded app? Sure, you can run some sort of web service on hundreds of Docker containers, but couldn't you run a single, fast web server that scales?
Agreed, though I keep wanting to take the time to get VRRP working with a web server to have redundancy. OpenBSD uses this to coordinate stateful firewalls with 2 or more systems: if 1 goes down, all state info is present on the second node, which takes over.
Hi, OP author here: I have actually set up VRRP (well, UCARP) on Docker, so it's possible to containerize even this facet of running an HA ops stack with Docker as the infrastructure. As you say, however, it is only used for one active node plus a number of fail-overs in case that one goes down. In terms of maintenance (hosts do go down, scheduled downtime is common), it's priceless to have this part of the puzzle portable as well. If you want to check it out, there's a GitHub repo available here: https://github.com/titpetric/ucarp-ha - and a future article about it is planned as well. It will also become part of the e-book which I'm currently working on and publishing on Leanpub: https://leanpub.com/12fa-docker-golang :)
Actually, neither should be a problem if you have enough redundancy :) The hardest part of rolling your own infrastructure is testing mission-critical systems (like databases) to be fault tolerant and at the same time reliable. Lots of great projects are out there that address some of these issues, but it takes a lot of attention to detail (transaction rates, ACID compliance, replication, etc.) to get it right. This is why a lot of developers who aren't in unicorn startups take advantage of technology which is available from giants like Amazon or Google, or from specific problem-domain companies like CloudFlare. Netflix serves as a great example of a technology-driven company that is an inspiration to us, but there are so many others that really changed the way we approach problems - Tumblr, Etsy. But to stay on the topic of Netflix - I think their idea behind "chaos monkey" is great, and we're increasingly rolling out a (currently simple) Docker Swarm version of it - https://github.com/titpetric/docker-chaos-monkey - the best way to eliminate worry is to test failure scenarios. As docker chaos monkey is designed to unpredictably "kill off" containers, your system gets the benefit of being designed to handle failures. It's one of those problems that you have to have a passion for, however - it's like testing software. You're only testing software for the functionality and failures which you can predict, and I'm pretty sure that none of us can predict all the ways in which software (or distributed systems) can fail. As such, it's a never-ending occupation. :)
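The core of a chaos-monkey style tool is small enough to sketch. This is not the code from the linked repository, just a hedged illustration of "unpredictably pick something to kill"; the container names and the protected list are made up, and the actual kill step (for example via the Docker CLI or API) is deliberately left out.

```python
import random


def pick_victim(containers, protected=(), rng=random):
    """Choose one container at random, never one from the protected list."""
    candidates = [c for c in containers if c not in protected]
    return rng.choice(candidates) if candidates else None


# Hypothetical running containers; the database is excluded from the experiment.
running = ["web-1", "web-2", "worker-1", "db-primary"]
victim = pick_victim(running, protected=("db-primary",))
print("would kill:", victim)
```

Running the pick on a timer and checking that the service stays healthy afterwards is what turns this into a failure test.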
But if a single process has a single account on the database, how do you partition those permissions? Simply providing multiple logins won't help if you assume hostile code is in your process space.
On the other hand, if each service has its own login, then the database can enforce lowest authority for each. A compromise of one service isn't a game over scenario.
It's the difference between a single account holding the union of all permissions and several accounts each holding a disjoint set.
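The union-versus-disjoint point can be shown with plain Python sets. The service names and permission strings are invented for the example; a real database would express these as grants on its own objects.

```python
# Hypothetical per-service grants, deliberately disjoint.
SERVICE_GRANTS = {
    "orders": {"orders.select", "orders.insert"},
    "billing": {"invoices.select", "invoices.insert"},
    "sessions": {"sessions.select", "sessions.delete"},
}

# A single shared account needs the union of everything.
shared_account = set().union(*SERVICE_GRANTS.values())


def blast_radius(compromised_service, shared):
    """Permissions an attacker gains by compromising one service."""
    return shared_account if shared else SERVICE_GRANTS[compromised_service]


print(len(blast_radius("orders", shared=True)))   # 6
print(len(blast_radius("orders", shared=False)))  # 2
```

With one login per service, compromising "orders" exposes only what "orders" was granted.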
Cloud Foundry uses Garden which uses runC. But our Garden had a container system that predated docker and nspawn. So probably another case of Not Invented Yet Syndrome.
Can someone who needs to run workloads like this explain to me why this is needed? It sounds like over-engineering for the sake of it. There are only so many apps at Facebook scale in the world.
I'm still interested in how to merge features like AWS Autoscaling with Docker to right-size the underlying infrastructure for the amount of container work going on.