Hacker News
by icythere 1810 days ago
One of the issues I've often seen is that my teammates send the "right command" to the wrong cluster and context. We have a bunch of clusters and it's always surprising to see some laptop deployments on ... the production cluster.

So I wrote this https://github.com/icy/gk8s#seriously-why-dont-just-use-kube... It doesn't come with any autocompletion by default, but it's a robust way to deal with multiple clusters. Hope this helps.

Edit: Fix typo err0rs

13 comments

I'm lazy and I don't like having to remember "the right way" to run something, so my solution is directories and wrappers. I keep a directory for every environment (dev, stage, prod, etc) for every account I manage.

  env/
      account-a/
                dev/
                stage/
                prod/
      account-b/
                dev/
                stage/
                prod/
I keep config files in each directory. I call a wrapper script, cicd.sh, to run certain commands for me. When I want to deploy to stage in account-b, I just do:

  ~ $ cd env/account-b/stage/
  ~/env/account-b/stage $ cicd.sh deploy
  cicd.sh: Deploying to account-b/stage ...
The script runs ../../../modules/deploy/main.sh and passes in configs from the current directory ("stage") and the parent directory ("account-b"). Those configs are hard-coded with all the correct variables. It's impossible for me to deploy the wrong thing to the wrong place, as long as I'm in the right directory.

I use this model to manage everything (infrastructure, services, builds, etc). This has saved my bacon a couple times; I might have my AWS credentials set up for one account (export AWS_PROFILE=prod) but trying to deploy nonprod, and the deploy immediately fails because the configs had hard-coded values that didn't match my environment.
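For readers who want to try the directory approach, here is a minimal sketch of how a wrapper might derive its target from the working directory. The real cicd.sh is not shown in this thread, so the function name target_from_dir and the way the path is parsed are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper (not the commenter's actual script): derive
# "<account>/<environment>" from a path ending in env/<account>/<environment>.
# It refuses any other location, so a deploy started from the wrong place
# fails instead of guessing.
target_from_dir() {
    dir=${1:-$PWD}
    env_name=$(basename "$dir")
    account_dir=$(dirname "$dir")
    account=$(basename "$account_dir")
    root=$(basename "$(dirname "$account_dir")")

    if [ "$root" != "env" ]; then
        echo "cicd.sh: $dir is not an env/<account>/<environment> directory" >&2
        return 1
    fi
    printf '%s/%s\n' "$account" "$env_name"
}
```

A wrapper could then start with something like: target=$(target_from_dir) && echo "cicd.sh: Deploying to $target ..."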

Very interesting solution to the problem. Pretty much everyone has their $PS1 set to show the current working directory, because the desire to know the implicit context of our commands ($PWD) has existed since the dawn of computing. Since then, we've added a lot of commands that have an implicit context, but we haven't updated our tooling to support them. That's a big problem, but I like your solution -- make the kubernetes context depend on the working directory, which your shell already prints out for you before every command.

(If I were redoing this all from scratch, I would just have my interactive terminal show some status-information above the command after I typed "kubectl "; the context, etc. That way, you know at a glance, and you don't have to tie yourself to the filesystem. And, this could all be recorded in the history, perhaps with a versioned snapshot of the full configuration, so that when this shows up in your history 6 weeks later, you know exactly what you were doing.)

With that in mind, I do feel like the concept of an "environment" has been neglected by UI designers. I never know if I'm on production, staging, private preview, or what; either for my own software, or for other people's software. (For my own, I use "dark reader" and put staging in dark mode and production in unmodified mode. Sure confuses people when I share my screen or file bug reports, though. And, this only works if you have exactly two environments, which is fewer than I actually have. Sigh!)

I simply put my apps in different namespaces on dev/stage/prod/etc. That way a kubectl command run against the wrong cluster will fail naturally.
That's a great idea. As long as you have that in the design from the start, that's very cool. Moving existing infra to support the idea is hard and quite a nightmare. In our new clusters, we apply the idea you've shared.
Agree, this is a huge pain point when dealing with multiple clusters. I wrote a wrapper for `kubectl` that displays the current context for `apply` & `delete` and prompts me to confirm the command. It's not perfect, but it's saved me a lot of trouble already. Encouraging other members of the team to have a similar setup is another story.

Here's the script (along with a bunch of extra utils): https://github.com/pch/dotfiles/blob/master/kubernetes/utils...
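The linked script is not reproduced here. As a rough sketch of the same idea (not the commenter's code), a confirming wrapper could look like the following. The function names needs_confirmation and kubectl_confirm are made up for this example, and the check deliberately over-matches: any argument equal to apply or delete triggers the prompt.

```shell
#!/bin/sh
# Succeeds when any argument is a verb we want confirmed. Matching anywhere
# in the argument list means "kubectl -n dev delete ..." is caught too, at
# the cost of an occasional unnecessary prompt.
needs_confirmation() {
    for arg in "$@"; do
        case "$arg" in
            apply|delete) return 0 ;;
        esac
    done
    return 1
}

# Shows the current context and asks before running a dangerous verb.
# Caveat: if the command carries its own --context flag, the context printed
# here is not the one that will be used.
kubectl_confirm() {
    if needs_confirmation "$@"; then
        ctx=$(command kubectl config current-context)
        printf 'Context: %s. Run "kubectl %s"? [y/N] ' "$ctx" "$*"
        read -r answer
        [ "$answer" = "y" ] || { echo "Aborted." >&2; return 1; }
    fi
    command kubectl "$@"
}
```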

Very valuable script. Thanks for sharing.
A lot of people like kubectx. Or specifying contexts. Personally I hate both approaches.

For the several dozen clusters that I manage, I have separate kubeconfig files for each and I use the --kubeconfig flag.

It's explicit and I have visual feedback in the command I run for the cluster I'm running against, by short name. No stupidly long contexts.

My approach was to have a default kubeconfig for dev/QA environments, and a separate one for production. I had a quick wrapper script to use the prod config file - it would set the KUBECONFIG env var to use the prod file, and update my PS1 to be red, a clear differentiator that reminds me I'm pointed at prod.
Exactly! Having separate configs is very easy. I have that support in my tool ;)
Not a perfect solution but I add a prompt signaling both my current namespace and cluster, along with some safeguards for any changes on our production environment. In practice I have never deployed anything to production by mistake.
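A sketch of what such a prod wrapper might look like when sourced into an interactive bash session. The file path, the PROD_KUBECONFIG variable and the function names are assumptions; only the KUBECONFIG environment variable itself comes from kubectl.

```shell
#!/bin/sh
# Hypothetical location; the comment does not say where the prod file lives.
PROD_KUBECONFIG="${PROD_KUBECONFIG:-$HOME/.kube/config-prod}"

# Point kubectl at the prod kubeconfig and turn the prompt red.
use_prod() {
    export KUBECONFIG="$PROD_KUBECONFIG"
    # \[ and \] tell bash the enclosed bytes are non-printing; 1;31 is bold red.
    PS1='\[\033[1;31m\][PROD]\[\033[0m\] \w \$ '
}

# Go back to the default kubeconfig (kubectl then reads ~/.kube/config).
use_default() {
    unset KUBECONFIG
    PS1='\w \$ '
}
```

Because these functions change the current shell's environment, they have to be sourced rather than run as a separate script.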

I use a custom-written script but I've used this one in the past - it's pretty nice.

https://github.com/jonmosco/kube-ps1/blob/master/kube-ps1.sh

I have a prompt display as well, but to my own dismay, earlier that year, I applied some QA config to a prod system. (It did not cause substantial harm, thankfully.) After that, I changed my prompt display so that names of production regions are highlighted with a red background. That seems to really help in situations of diminished attentiveness from what I can tell.
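The red-background idea can be sketched as a small helper that formats a context name for the prompt. This is an illustration, not the commenter's setup: the function name context_segment is invented, and it assumes production contexts can be recognised by "prod" appearing in the name.

```shell
#!/bin/sh
# Print a context name for use in a prompt. Names containing "prod"
# (an assumed naming convention) get a red background with white text;
# everything else is printed plainly in brackets.
context_segment() {
    case "$1" in
        *prod*) printf '\033[41;37m %s \033[0m' "$1" ;;
        *)      printf '[%s]' "$1" ;;
    esac
}
```

In bash it could be wired in with PS1='$(context_segment "$(kubectl config current-context 2>/dev/null)") \w \$ '. One caveat: the escape codes come from a command substitution and are not wrapped in \[ \], so bash may miscount the prompt width on long command lines.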
We partially resolve this by having different namespaces in each of our environments. Nothing is ever run in the 'default' namespace.

So if we think we're targeting the dev cluster and run 'kubectl -n dev-namespace delete deployment service-deployment' but our current context is actually pointing to prod then we trigger an error as there is no 'dev-namespace' in prod.

Obviously we can associate specific namespaces with contexts, which bypasses this safety net, but it can help in some situations.

direnv is our magic sauce for this. We enforce that all devs store the current context in an environment variable (KUBECTL_CONTEXT), and define the appropriate kubectl alias to always use that variable as the current context. To do stuff in a cluster, cd into that cluster’s directory, and direnv will automatically set the correct context. I also change prompt colors based on the current context.

(This way, the worst you can do is re-apply some yaml that should’ve already been applied in that cluster anyway)

We also have a Makefile in every directory, where the default pseudo-target is the thing you want 99% of the time anyway: kustomize build | kubectl apply -f -
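A sketch of the direnv setup described above, under some assumptions: the context name dev-cluster is a placeholder, KUBECTL_BIN is a hypothetical override added so the wrapper can be dry-run, and a shell function stands in for the alias the comment mentions (functions also work in non-interactive shells).

```shell
#!/bin/sh
# Each cluster directory would hold an .envrc that direnv loads on cd, e.g.:
#   export KUBECTL_CONTEXT=dev-cluster

# In the shell rc file: always pass the context from the environment.
# If KUBECTL_CONTEXT is unset or empty, ${...:?} aborts with the message
# instead of silently using whatever context kubectl last had.
kubectl() {
    command "${KUBECTL_BIN:-kubectl}" \
        --context "${KUBECTL_CONTEXT:?KUBECTL_CONTEXT is not set; cd into a cluster directory}" \
        "$@"
}
```

With this in place, running kubectl outside any cluster directory fails immediately rather than hitting the last-used cluster.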

I wrote a convoluted tool for this problem which isolates kubectl environments in docker containers: https://github.com/forestgagnon/kparanoid.

This approach allows the convenience of short, context-free commands without compromising safety, because the context info in the shell prompt can be relied on, due to the isolation.

There are some things which don't work well inside a docker container (port-forwarding for example), but it does make it simple to have isolated shell history, specific kubectl versions, etc.

I like how you explain the problem of `PS1` (stateful vs stateless). I actually saw the problem before but only once.
This is really nice. I like that it doesn't munge the `kubectl` command itself.
When I was running the internal k8s clusters at a previous workplace, I simply got into the habit of compulsively running `kubectl config current-context` to check which one of the 50+ clusters I was currently connected to (designated test clusters for "playing with cluster infra", designated clusters for "devs playing around", designated prod clusters, with segregation between "batch-like" and "interactive" workloads, as we needed to treat the nodes differently in those, designated "run the CI/CD pipelines" clusters, as they needed different RBAC, ... and then duplicate between multiple data centres).
thanks for starting that thread, context is a major hurdle for beginners.

I myself am quite happy with the basics, but I have an alias k=kubectl and a set-context alias that displays the current context when called without an argument. Before doing anything I rename or edit contexts in .kube/config so the target takes a minimal number of characters to type ("proj-prod"). Using -l name= is another help in filtering, as are jsonpath and jq. As with the CLI prompts of database products years ago, building up muscle memory also gave me the opportunity to grok the concepts at the same time.

After some attempts with different tooling, I came to like kubernetes for what it can do.

k9s is a fantastic tool. It's a CLI GUI written in Go.
More of a TUI than a CLI. I love its presentation but it falls apart if you have about 1000 pods or more.
I used k9s before and that's an awesome tool. Tho it doesn't help when I want to send a command to my teammate and he just executes it on the wrong cluster. That's the problem I want to solve.
Create a User/Role for deleting (or whatever dangerous action) resources in the prod cluster/namespace. Set up RBAC which allows your employees to impersonate that user/role using kubectl --as. This way if you send your coworkers a command for the dev environment and they try to run it in prod it will fail because they didn’t run kubectl as that impersonated user.
Totally agreed. This is the right way for many problems. Sometimes it's not really possible to apply the idea: in one of my past workplaces, everyone (even newbies) was given full _root_ privileges -- the idea was to help the team learn from their mistakes (if any), and it's actually a great idea.
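As a partial sketch of the impersonation setup, the helper below prints a ClusterRole that allows impersonating a single user. The user name prod-deleter and the function name impersonation_role are hypothetical. Binding the role to your team and granting the impersonated user its actual delete permissions are separate steps that are not shown.

```shell
#!/bin/sh
# Print a ClusterRole manifest that permits impersonating exactly one user.
# $1 is the user name to impersonate (a hypothetical name such as prod-deleter).
impersonation_role() {
    cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-$1
rules:
- apiGroups: [""]
  resources: ["users"]
  verbs: ["impersonate"]
  resourceNames: ["$1"]
EOF
}
```

It could be applied with impersonation_role prod-deleter | kubectl apply -f -, after which a destructive command in prod would need the explicit flag, for example kubectl --as=prod-deleter delete deployment service-deployment. The same command pasted without --as would be rejected, provided the person's own account has no delete rights there.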
I'm glad to hear that this is a more common problem. When sharing kubectl commands, I always specify the --context flag explicitly so the person using it has to manually edit the context name to whatever they are using before running it.
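The same habit can be enforced locally. The following is a minimal sketch, with invented function names, of a wrapper that refuses to run kubectl unless the command line names a context explicitly.

```shell
#!/bin/sh
# Succeeds only if the arguments contain "--context NAME" or "--context=NAME".
has_explicit_context() {
    for arg in "$@"; do
        case "$arg" in
            --context|--context=*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical strict wrapper: no explicit context, no command.
strict_kubectl() {
    has_explicit_context "$@" || {
        echo "refusing to run: add --context <name>" >&2
        return 1
    }
    command kubectl "$@"
}
```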
That's definitely helpful. Some different applications support different options to switch context. For example, Helm uses `--kube-context`.
I like the spirit of this but for dealing with multiple clusters, kubectx is pretty standard, it always highlights where you are, and we don't have to type the cluster name in every command. Also avoiding "kubectl delete" seems like such a narrow case; I can still delete with "k scale --replicas=0" and possibly many other ways. At this point you are better off with a real RBAC implementation.
Isn't kubectx the problem, not the solution? You think you are in one context but you are actually in another. You wanted to tear down the dev deployments but you nuked the production ones instead.
Every member of our ops team has the following PS1 var in .bashrc to prevent such accidents:

  PS1='\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]\[\033[01;33m\] [`kubectl config current-context| rev | cut -d_ -f1 | rev`] \[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\] $ '
couldn't that be solved by not allowing production access to those clusters? most k8s providers should allow role based access (read/write/deploy)