|
I am going to state a somewhat unpopular opinion: Kubernetes in its current form is just short of unusable in complex production environments. In order to run a full Kubernetes stack, you need to get a multitude of components running, each with its own name and unknown effects on the underlying system. I have experience with two forms of deployment. The first was an on-prem installation two years ago, in which one of the 13 Java 8 applications had very high latencies when accessing an Oracle DB, although it worked fine when deployed on a simple VM. All of those applications used the same DB logic, and we couldn't find the issue on our own, so we asked a third party to debug it for us. They couldn't pinpoint the problem, even with commercial tools. Their answer was just "something is off with your Kubernetes installation", and that was it. My second, ongoing experience is my current assignment with a Fortune 500 company that uses GKE to run hundreds of nodes, after migrating from on-prem VMs. Almost every other week (99% reliability - yeah right), some part of the system just dies and leaves services unreachable or unresponsive. There is a continuous effort to solve these issues, and even Google support was contacted, with the answer boiling down to: shit happens, deal with it. The only solution in those situations is either to have alarms go off so that Ops can restart something, or just to wait until everything comes back up on its own. The whole ecosystem was a good idea that lacks the proper tooling and stability to provide substantial benefits over a bunch of VMs, IMO. |