by crabbone 698 days ago
This is super, super, super naive. You, essentially, just solved for the case of one. But now you need to solve for N.

Do you seriously believe you will never be in a situation where you have to run... two containers?.. With two different images? If my experience is anything to go by, even postcard Web sites often use 3-5 containers. I just finished deploying a test of our managed Kubernetes (technically, it uses containerd, but it could be using Docker). And it has ~60 containers. And this is just the management part. I.e. no user programs are running there. It's a bunch of "operators", CNIs, CSIs etc.

In other words: if your deployment was so easy that it could all fit into a single container -- you didn't have a dependency problem in the first place. But once you get a realistic-size deployment, you now have all the same problems. If libssl doesn't implement the same version of the TLS protocol in two containers -- you are going to have a bad time. But now you've also amplified this problem because you need certificates in all containers! Oh, and what fun it is to manage certificates in containers!

> Kubernetes is an abstraction layer that (mostly) hides the complexity of storage networking etc

Now, be honest. You didn't really use it, did you? The complexity in e.g. storage may manifest in many different ways. None of them have anything to do with Kubernetes. Here are some examples: how can multiple users access the same files concurrently? How can the same files be stored (replicated) in multiple places concurrently? What about the first and second together? Should replication happen at the level of the block device or the filesystem? Should snapshots be incremental or full? Should user ownership be encoded into storage, or should there be an extra translation layer? Should storage allow discards when dealing with encryption? And many, many more.

Kubernetes doesn't help you with these problems. It cannot. It's not designed to. You have all the difficult storage problems whether you have Kubernetes or not. What Kubernetes offers is a possibility for the storage vendors to expose their storage product through it. Which is nothing new. All those storage products can be exposed through some other means as well.

In practice, storage vendors who choose to expose their products through Kubernetes usually end up exposing only a limited subset of the storage functionality that way. So not only does storage through Kubernetes not solve your problems, it adds more of them. Now you may have to work around the restrictions of Kubernetes if you want to use some unavailable features (think, for example, of all the Ceph CLI that you are missing when using Ceph volumes in Kubernetes: it's hundreds of commands that are suddenly unavailable to you).

----

You seem like an enthusiastic person. And you probably truly believe what you write about this stuff. But you are in way over your head. You aren't really an infra developer. You kind of don't even really recognize the general patterns and problems of this field. And that's OK. You don't have to be / do that. You just happen to be a new car owner who learned how to change oil on your own, and you are trying to preach to a seasoned mechanic about the benefits and downsides of different engine designs :) Don't take it to heart. It's one of those moments where maybe years later you'll suddenly recall this conversation and feel a spike of embarrassment. Everyone has that.

1 comment

Looking at my company's Rancher dashboard, it looks like I'm currently running about 7500 pods. Assuming 1.5 containers/pod (probably high) then I'm not running 1 container, I'm running about 11 thousand containers right now. Please don't assume I can't understand what you're saying because of any particular level of experience. Your points are just as understandable regardless.

I'm not sure there's a real use case for running multiple versions of the same app at the same time tbh. If the devs have a new version they're trying to push out, then first their branch has to pass automated tests before it can be merged to master, which (mostly) ensures old functionality doesn't fail. Then our deployment pipeline deploys it to staging, makes sure everything is healthy and readiness probes are returning 200, then deploys it to prod, makes sure everything comes up, and finally switches the k8s service to point to the new pod versions. If anything breaks at that point, the old pods are still around and I can swap the k8s services to point to the old deploy instantly.
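A minimal sketch of the "switch the service, keep the old pods around" step described above. It models the Service as a plain dict rather than calling the Kubernetes API, and the `version` label, the pod records, and the function names are all hypothetical, not the actual pipeline:

```python
import copy


def switch_service(service: dict, version: str) -> dict:
    """Return a copy of a Service manifest whose selector targets `version`."""
    updated = copy.deepcopy(service)
    updated["spec"]["selector"]["version"] = version
    return updated


def promote(service: dict, new_version: str, pods: list) -> dict:
    """Switch traffic only if every pod of the new version reports ready.

    `pods` is a hypothetical list of {"version": str, "ready": bool} records,
    standing in for whatever the readiness probes report.
    """
    new_pods = [p for p in pods if p["version"] == new_version]
    if new_pods and all(p["ready"] for p in new_pods):
        return switch_service(service, new_version)
    return service  # leave traffic on the old pods


service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {"selector": {"app": "web", "version": "v1"}},
}

pods = [
    {"version": "v1", "ready": True},
    {"version": "v2", "ready": True},
    {"version": "v2", "ready": True},
]

promoted = promote(service, "v2", pods)
rolled_back = switch_service(promoted, "v1")  # old pods still exist, so this is instant
```

Because the old pods are never touched, rolling back is the same selector change in the other direction.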

If, for example, two versions of libssl are somehow treating the same protocol version differently, then that'd be detected on staging at the latest. If the devs know they need to upgrade protocol versions from (for example) TLS 1.2 to TLS 1.3, then they'll deploy a version that runs on TLS 1.2 and 1.3, then once everything is working deploy a version that works only on TLS 1.3. Nothing actually takes production traffic until we're fully assured it's healthy. We haven't had a maintenance or upgrade outage for at least 3 years.
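The two-phase TLS upgrade above can be sketched with Python's standard `ssl` module. This is only an illustration under assumptions: the apps in question may not be Python at all, the `server_context` helper is hypothetical, and certificate loading is left out.

```python
import ssl


def server_context(min_version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a server-side context that accepts min_version up to TLS 1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = min_version
    ctx.maximum_version = ssl.TLSVersion.TLSv1_3
    # A real server would also call ctx.load_cert_chain(...) here.
    return ctx


# Phase 1: deploy a version that speaks both TLS 1.2 and TLS 1.3.
phase_one = server_context(ssl.TLSVersion.TLSv1_2)

# Phase 2: once every peer is on phase 1, deploy a TLS 1.3-only version.
phase_two = server_context(ssl.TLSVersion.TLSv1_3)
```

At every point in the rollout, any two deployed versions share at least one protocol version, which is what lets the upgrade happen without an outage.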

Could all this be replicated on a VM platform assuming it has an appropriate API? Definitely. But k8s has all this covered already. How do I switch traffic from the old pods to the new pods? The deployment pipeline runs `kubectl apply -f ingress.yaml` and k8s patches all the load balancer configs to point to the new pods. That's the entirety of what I would need to do if it wasn't already automated.

Certificate management is also pretty easy. Each pod pulls a cert from our PKI (Hashicorp Vault) when it starts up. If the leaf cert expires (unlikely because pods are usually replaced by a new version well before then) then the app throws an exception, the pod goes unhealthy, k8s restarts the pod, the new pod gets a new cert, and it's good for another ~year. This is completely automated by k8s.
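As a rough illustration of the "cert expires, pod goes unhealthy, k8s restarts it" loop, here is a sketch of a health check that a liveness probe could hit. The function names and the idea of exposing expiry through a probe are assumptions for the example; in the setup described above the app simply throws an exception instead.

```python
from datetime import datetime, timedelta, timezone


def cert_is_valid(not_after: datetime, now: datetime,
                  margin: timedelta = timedelta(0)) -> bool:
    """True while the leaf cert has more than `margin` of lifetime left."""
    return now + margin < not_after


def liveness_status(not_after: datetime, now: datetime) -> int:
    """HTTP status a hypothetical /healthz endpoint would return."""
    return 200 if cert_is_valid(not_after, now) else 503


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = issued + timedelta(days=365)  # roughly the ~year lifetime mentioned above

status_mid_life = liveness_status(not_after, issued + timedelta(days=100))
status_expired = liveness_status(not_after, issued + timedelta(days=366))
```

A failing liveness probe makes the kubelet restart the container, and the restarted pod pulls a fresh cert at startup.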

Cert management for the k8s nodes themselves actually does involve VMs a bit. Some of our clusters are on AWS EC2 and are set up with autoscaling groups, so that a node with too little usage gets downscaled. If a node's cert is close to expiring, the node as a whole goes unhealthy, k8s automatically removes all pods from that node and spins up new replicas on other nodes, and EC2 detects that load is low and downscales that node. If spinning up the new pods pushes utilization on the other nodes too high, EC2 will spin up new nodes with new certs, and everything will be fine for another ~year. Other clusters run on on-prem VMs, and we haven't completely automated that yet, so those are still manual restarts.

Every few years the root cert will expire and we'll have to restart all the pods or nodes at once. Pods are easy: just redeploy and they'll all get the new cert, or worst case I can run `kubectl delete --all pods` and the PodDisruptionBudgets will ensure that there's a rolling rollout. For nodes, I'll scale up the cluster (increase min replicas in EC2 or add more nodes through the VM platform) so there's a bunch of nodes with the new root cert, then drain all the existing nodes, which will cause k8s to spin up new app pods on the new uncordoned nodes, then shut down all the old VMs or let EC2 handle it.

You're right that k8s doesn't help with app-level storage issues like concurrent access, nor does it help with storage-level issues like backups and replication. I should've been more specific that k8s helps with how the apps connect to storage. While migrating VM-deployed apps to containers I've found a few ways they've done it: config files specifying connection strings, hardcoded strings in code, pulling values from secrets management, requiring that the VM have the fileshare mounted already, etc. In k8s there's one way to do it: the app's manifest includes a PV and a PVC. Ops handles how k8s connects to the storage from there. This isn't really a k8s advantage; you could tell all your devs to use some internal library that abstracts storage and let ops write or maintain that library too. But that really only works with one company at a time, while when we onboard an acquisition that uses k8s they've already got PVs set up so we just have to migrate those. My point in saying that k8s abstracts connecting to storage was mostly about how it's an industry standard interface specifically for connecting to storage, which helps eliminate having to figure out how each individual app connects. If security makes a firewall rule that blocks all your VMs from hitting storage, then for VM-deployed apps I've got to ask "ok, did the devs change this config file? Did someone forget to mount the fileshare or did an update break that? Is it some third option I've never seen before?" while for our k8s-deployed apps I've got one place to start looking using kubectl.
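For readers who haven't seen the "one way to do it", here is a minimal sketch of a PersistentVolumeClaim and the pod volume entry that references it, built as Python dicts and dumped as JSON (which `kubectl apply -f` accepts alongside YAML). The claim name, size, storage class, and helper names are made up for illustration.

```python
import json


def pvc_manifest(name: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest for the given size and class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_volume(claim_name: str) -> dict:
    """Volume entry for a pod spec that mounts the claim by name."""
    return {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}


claim = pvc_manifest("app-data", "10Gi", "standard")  # hypothetical values
volume = pod_volume(claim["metadata"]["name"])

print(json.dumps(claim, indent=2))
```

The app only ever names the claim; which backend satisfies it is decided by the storage class and whatever ops has wired up behind it.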

Another point I didn't address is that yes this does require specific app architectures. The pods have to be stateless, databases are not in k8s and certainly not running alongside the app itself in the same container or pod, concurrent file access is not generally my problem, and security's wacky firewall rules can be fun to implement when I can't say what IP a particular app has. But I think the tradeoffs are generally worth it.

You're right I'm not the most experienced at large scale infrastructure problems outside of k8s. I've managed or helped manage a couple of small server racks and a single 6-rack datacenter before, and I work closely with the non-k8s infrastructure team at my current company, but I'm not the one deciding what we're going to do to get off of VMware, for example. What I can say though is that between my past experience and the companies we've acquired, there's a lot more variation and lack of best practices among the companies that don't use k8s compared to the ones that do. With the non-k8s companies I have to familiarize myself with the idiosyncratic way each of them handles every aspect of their infrastructure; with the k8s companies I already know at least half of their infrastructure.