by yongjik 2219 days ago
Well, nobody asked me, and I'm no expert, but here's my list of what (not) to do in Kubernetes (if I had the authority).

1. There. Is. No. Machine. (Insert matrix meme here.) Before you open up your cluster to the rest of the company, drill this into them. Maybe even create a Google Form where they have to sign "I hereby acknowledge that there is no machine in k8s and any attempt to tie my job to a particular machine means a broken config by definition."

2. Thanks to 1, don't let anyone use hostNetwork, hostIPC, hostPID, hostPort, host whatever, unless there is a really good reason (and an explicit approval process).

3. Don't let anybody start a job without memory/CPU limits. Make sure they understand that, if the job goes over its memory limit, it dies, and that's not the k8s admin's problem.

4. You can't log anything into the pod - when the pod dies the log is gone. You can't log into the machine, either (see 1). Therefore, you really need some kind of logging framework that takes the log from your pod and saves it, in its raw form, somewhere safe (like S3). I don't know if there's any such framework, but there had better be.

5. Make sure every manual operation is logged (who did what to which job when), unless you like asking "@here Does anybody know who owns fooservice?" every month.

6. Kubernetes is not magic: if it takes thirty minutes to provision your service, fix that, instead of moving thirty minutes of manual provisioning into k8s and somehow expecting it to be magically reliable.

7. Don't bring in existing dependencies uncritically. If your job connects to a zookeeper server to find out its peers, don't bring that into k8s; rewrite it to use a k8s Service instead.

8. Take extra extra care when writing down your first job specification, because there are a lot of yaml files to write, and people will just copy what's already there. If your first k8s job mounts host /tmp directory just because you were testing something and forgot to delete the line, soon you will have fifty jobs all mounting host /tmp directory. Good luck figuring out which job actually needs it then.
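Points 2, 3 and 8 are the kind of thing a small script can check before a spec gets copied fifty times. A minimal sketch, assuming the pod spec has already been parsed from YAML into a plain dict; `lint_pod_spec` and `copied_spec` are made-up names for illustration, and the checks cover only the fields mentioned above, not everything a real policy would want:

```python
from typing import Any, Dict, List

# Pod-level flags that tie a job to the host it happens to land on.
HOST_FLAGS = ("hostNetwork", "hostPID", "hostIPC")


def lint_pod_spec(spec: Dict[str, Any]) -> List[str]:
    """Return human-readable findings for one pod spec (a parsed dict)."""
    findings = []

    # Point 2: host namespaces should be off unless explicitly approved.
    for flag in HOST_FLAGS:
        if spec.get(flag):
            findings.append(f"{flag} is enabled")

    # Point 8: hostPath volumes are the "mounts host /tmp" case.
    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            path = volume["hostPath"].get("path")
            findings.append(f"volume {volume.get('name')} mounts hostPath {path}")

    for container in spec.get("containers") or []:
        name = container.get("name", "<unnamed>")

        # Point 2 again: hostPort binds a port on the node itself.
        for port in container.get("ports") or []:
            if "hostPort" in port:
                findings.append(f"container {name} binds hostPort {port['hostPort']}")

        # Point 3: every container needs both a CPU and a memory limit.
        limits = (container.get("resources") or {}).get("limits") or {}
        for resource in ("cpu", "memory"):
            if resource not in limits:
                findings.append(f"container {name} has no {resource} limit")

    return findings


# Hypothetical spec that was copied from someone's test job.
copied_spec = {
    "hostNetwork": True,
    "volumes": [{"name": "scratch", "hostPath": {"path": "/tmp"}}],
    "containers": [
        {
            "name": "app",
            "image": "example/app:1.0",
            "resources": {"limits": {"memory": "512Mi"}},
        },
    ],
}

for finding in lint_pod_spec(copied_spec):
    print(finding)
```

Something like this would probably live in CI or a review hook, so the first bad spec gets flagged before it becomes the template for everyone else.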

Yeah, again, I'm by no means an expert - I'm not even an admin, so just consider the list the ramblings of some poor soul who has seen some stuff. Here be dragons, have fun.

3 comments

> 8. Take extra extra care when writing down your first job specification, because there are a lot of yaml files to write, and people will just copy what's already there. If your first k8s job mounts host /tmp directory just because you were testing something and forgot to delete the line, soon you will have fifty jobs all mounting host /tmp directory. Good luck figuring out which job actually needs it then.

This piece of advice is so underrated, it's hard to put into words. Every senior developer/architect should know that.

4. Centralized logging is a basic requirement in any company. In kubernetes, a container simply logs to stdout and a fluentd/logstash agent forwards it.

7. Sadly, if existing software can't run in kubernetes, this severely limits the benefits of having kubernetes. Why use something that can't be used? If the jobs already have service discovery, they might be better off running on hostNetwork or whatever allows them to work as is.
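For point 4, the application side can be as small as this. A minimal sketch using only the Python standard library; `make_stdout_logger` is a made-up helper, `fooservice` is borrowed from the list above, and the forwarding agent (fluentd, logstash, whatever the cluster runs) is assumed to be set up separately:

```python
import logging
import sys


def make_stdout_logger(name: str) -> logging.Logger:
    """Return a logger that writes one line per record to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the handler only once, even if this is called repeatedly.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    # Don't also emit through the root logger, which would duplicate lines.
    logger.propagate = False
    return logger


log = make_stdout_logger("fooservice")
log.info("service started")
```

Nothing is written inside the container's filesystem, so nothing is lost from the app's point of view when the pod goes away; keeping the lines is the agent's job.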

Re: 7, if you use a separate discovery like zookeeper, then it will have its own idea of which jobs are alive and ready, and k8s will have its own idea of the same thing - they may not necessarily agree with each other. They don't even know of each other's existence. Maybe it would work, but it seems like more hassle than rewriting the offending part.

From what I've seen, Kubernetes is a quintessential Google product - it has a very particular idea of how jobs should be run, and the farther you stray away from it, the more it will cost your sanity. If that curtails the benefits of k8s for you, then yes, I think you should revisit whether you really need k8s, based on that limitation.

Just my two cents.

If we're talking about zookeeper for service discovery: the registering application can maintain a connection to zookeeper and ping every few seconds, so zookeeper has a very good idea of what's running or not, usually more accurate than kubernetes.

Actually, zookeeper has an API to register only when ready and to listen to events/changes. Not sure kubernetes has an equivalent stable API, so it might be hard to port over.

A rewrite might be a solution, except it's not because it's doomed to fail. We're discussing service discovery, which implies multiple clients and servers, probably managed by different teams and written in different languages. The odds of completing a coordinated rewrite effort are abysmal. ^^

Well. I am thinking out loud. It's a real problem my company was facing. We've got kubernetes clusters that are supposed to run applications and we've got apps using zookeeper that can't run in kubernetes because it breaks the service discovery. I will eventually have to give people hints on how to make it happen, after one year of kubernetes hardly going anywhere.

These are all excellent ideas and honestly better than the article. You're being too modest.

We solve #2 (and more) by having a highly restrictive default PodSecurityPolicy. We started with a combination of GCE and OpenShift default examples (although we use neither) which are published on github. PSP lets admins and security relax a little at night.

#3 is solved by setting highly restrictive LimitRanges in each namespace that must be overridden in the deployment specs.

It's not mandatory that you override the defaults, but if you don't, you're going to get poor performance. If you're fine with that, the admins are also fine with that.

Sometimes occasional throttling during spikes is totally acceptable - it all just depends on whether the app actually needs maximum performance. There are many other apps in the cluster that DO, and in the grand scheme of things they benefit by having the ones that DON'T get throttled.
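To make the "restrictive defaults you have to override" idea concrete, here is a rough model of what a container ends up with. It is a simplified sketch, not the real admission logic: `apply_defaults` and the values in `RESTRICTIVE_DEFAULTS` are made up for illustration, the dict is only shaped like the `default` and `defaultRequest` fields of a Container-type LimitRange item, and min/max enforcement is left out entirely:

```python
import copy

# Hypothetical per-namespace defaults: `default` fills in missing limits,
# `defaultRequest` fills in missing requests.
RESTRICTIVE_DEFAULTS = {
    "default": {"cpu": "100m", "memory": "128Mi"},
    "defaultRequest": {"cpu": "50m", "memory": "64Mi"},
}


def apply_defaults(container, limit_range=RESTRICTIVE_DEFAULTS):
    """Return a copy of `container` with missing limits/requests filled in."""
    result = copy.deepcopy(container)
    resources = result.get("resources") or {}
    limits = dict(resources.get("limits") or {})
    requests = dict(resources.get("requests") or {})

    for resource in ("cpu", "memory"):
        explicit_limit = resource in limits
        if not explicit_limit and resource in limit_range["default"]:
            limits[resource] = limit_range["default"][resource]
        if resource not in requests:
            if explicit_limit:
                # A limit given without a request: the request follows the limit.
                requests[resource] = limits[resource]
            elif resource in limit_range["defaultRequest"]:
                requests[resource] = limit_range["defaultRequest"][resource]

    result["resources"] = {"limits": limits, "requests": requests}
    return result


# A spec that never mentions resources gets the tight defaults...
lazy = apply_defaults({"name": "batch"})
# ...while one that overrides them in its deployment spec keeps its own values.
tuned = apply_defaults(
    {"name": "api", "resources": {"limits": {"cpu": "2", "memory": "1Gi"}}}
)
print(lazy["resources"])
print(tuned["resources"])
```

The `batch` container is the one that gets throttled during spikes; the `api` container opted out by stating what it needs.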

Apparently Borg handles CPU differently (and better) than k8s in the multitenancy model, but this is the best "poor man's" Borg I can come up with and it works for us.