|
I use Ansible as well and it does not allow me to scale my app containers easily, especially because of dependencies. I think I either have to manage _everything_ by Ansible including firewalls, IRB/SVI on switches, Openstack security zones, users, roles etc. or I am asking for problems and they will definitely come. NB I do not like to manage my switches, routers and firewalls with Ansible. But my infrastructure is fairly complicated, some bare metal services, many VMs, several docker hosts with dozens of various app containers, private Openstack and some public cloud services (VPNs), Ceph cluster, nagios, ELK etc. Ansible does not watch and maintain the infrastructure and services state - it is a passive tool (I am not sure if Tower is different). You definitely could configure your monitoring to invoke Ansible script if your app server goes down and reschedule services that ran on that host to a different one and reconfigure other parts of infrastructure as needed (dependencies, VXLANs, security rules, service discovery db update, load balancing update etc.) But you would essentially copy the Kubernetes controller functionality. Kubernetes can take care of the service discovery, load balancing, deployment, configuration and networking and other parts of your infrastructure and it does it pretty well in my experience. It maintains the declared state and reacts to its changes. Somebody posted here that it takes just several hours to learn it. YMMV but deploying a resilient cluster in my environment took me much more time. And I have to agree that documentation is pretty weak. E.g. provisioning Cinder volumes and connecting them to the VMs running Kubernetes nodes was a real horror. Btw Ansible generates many of my k8s YAMLs and deploys them. |
Yeah, but what I've done is that i've deployed a tiny go binary on each cloud instance that is run every few minutes. Instances take turns checking on each other using a round robin sort of approach (no complicated leader election algorithms etc...)
The script knows how to check the health of the other instances/services and to restart them or alarm if they get stuck.
For fifty hosts, it works fairly well.
I didn't say in the original post, but we are not a product group, so our stuff is 'semi production'. We don't have customers to worry about.
>Somebody posted here that it takes just several hours to learn it.
Sure I can probably learn it quickly, but what I don't want is to now have complicated and mysterious kubernetes problems to solve on a deadline.
I understand linux pretty well, been using it for 20+years, so I'm not intimidated by OS level troubleshooting. Sure without containers, you have to be more careful to keep your dependencies from overlapping, but it hasn't been a problem we couldn't handle till now.
I don't particularly want to trade problems I'm familiar with solving for a whole new set of unfamiliar problems unless there is a clear benefit.