by anoncoward1234 2967 days ago
The problem with adding the Ingress controller via Helm (and with a lot of other Kubernetes abstractions) is that it spits out a lot of code that is then difficult or impossible to reason about. `Helm Ingress --whateversyntaxdefault` spits out 1000+ lines of Ingress controller code that is essentially two deployments with a health check and auto spin up, but it's complicated. In production can I use this or is there a security hole in there? What if the ports the health check are using overlap with other ports I have assigned somewhere else? What if something equally silly?

Maybe Kubernetes is new so that's why it's so wild west, but it really feels like a pile of bandaids right now.

4 comments

I have read through the nginx ingress controller code in Helm before deploying it into production.

What you're saying is pretty much the result of my biggest gripe with Kubernetes, though it's one I don't have a lot of ideas of how to fix; there's too much damn boilerplate. 1000 lines of YAML to store maybe 100 relevant lines.

That being said, can you trust that there is not a security vulnerability when you deploy e.g. NGINX alone? Your answer should not be yes. Even if you read through every single line of configuration and understand it, it doesn't mean something isn't wrong. Google "nginx php vulnerability" for an example of what I mean; a simple, innocent-looking configuration was wrong.

I read the Helm chart for nginx ingress because I wanted to understand what it was doing. But did I have to? Not really. I trust that the Helm charts stable folder is going to contain an application that roughly works as described, and that I can simply pass configuration in. If I want to be very secure, I'm going to have to dig way, way deeper than just the Kubernetes manifests, unfortunately. There's got to be some code configuring Nginx in the background, and that's not even part of the Helm chart.

> What you're saying is pretty much the result of my biggest gripe with Kubernetes, though it's one I don't have a lot of ideas of how to fix; there's too much damn boilerplate. 1000 lines of YAML to store maybe 100 relevant lines.

I think that's more a helm issue than a k8s issue. I've been using helm in production for over a year and k8s for almost three years. Prior to adopting helm we rolled our own yaml templates and had scripts to update them with deploy-time values. We wanted to get on the "standard k8s package manager" train so we moved everything to helm. As a template engine it's just fine: takes values and sticks them in the right places, which is obv not rocket science. The issues come from its attempt to be a "package manager" and provide stable charts that you can just download and install and hey presto you have a thing. As a contributor to the stable chart repo I get the idea, but in practice what you end up doing is replacing a simple declarative config with tons of conditionally rendered yaml, plug-in snippets and really horrible naming, all of which is intended to provide an api to that original, fairly simple declarative config. Add to that the statefulness of tiller and having to adopt and manage a whole new abstraction in the form of "releases." At this point I'm longing to go back to a simpler system that just lets us manage our templates, and may try ksonnet at some point soon.

The stable chart thing is so weird. Internally we use some abstractions, but I look at stable charts and it takes so much time just to understand everything that's going on. Everything is a variable pointing to values, and you can't reason about any of it.

It seems like the hope is, just ignore it all, and the docs are good, and just follow them, but I don't live in any kind of world where I can do that.

And the commits, and the direction of all of them, seem to head toward more and more impossible-to-read, conditionally rendered symbols.

I've had such a challenge understanding and using helm well enough. Small gotchas everywhere that can just eat up tons of time. This doesn't feel like the end state to me.

> It seems like the hope is, just ignore it all, and the docs are good, and just follow them, but I don't live in any kind of world where I can do that.

Yep, agreed, we've used very few charts from stable, and in some cases where we have we needed to fork and change them, which is its own special form of suck. The one I contributed was relatively straightforward: a deployment, service and a configMap to parameterize and mount the conf file in the container at start. Even so I found it a challenge to structure the yaml in such a way that the configuration could expose the full flexibility of the binary, and in the end I didn't come anywhere near that goal. You take something like a chart for elasticsearch or redis and it's just so much more complicated than that.

Right, I'm in particular working on charts for ELK, and it's just a mess. I just took down all my data (in staging, so all good) due to a PVC. The charts won't update without deleting them when particular parts of the chart change, but if you delete them, you lose your PVC data.

So I find the note in an issue somewhere stating, this is.. intentional?.. and that of course you need some annotation that will change it.

Let alone the number of things like, xpack, plugins, the fact that java caches the DNS so endpoints don't work on logstash, on and on.

It seems like everyone is saying operators are going to be the magical way to solve this, but if anything it seems like one set of codified values, that don't address any of the complexity.

You're using a statefulset? Here's a tip: you can delete a statefulset without deleting the pods with `kubectl delete statefulset mystatefulset --cascade=false`. The pods will remain running, but will no longer be managed by a controller. You can then alter and recreate the statefulset and as long as the selector still selects those pods the new statefulset will adopt them. If you then need to update the pods you can delete them one at a time without disturbing the persistent volume claims, and the controller will recreate them.
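
The tip above can be sketched as a small shell helper. This is only a sketch under assumptions: `recreate_statefulset` and its two arguments are hypothetical names, and it uses `--cascade=orphan`, which is how newer kubectl releases spell the `--cascade=false` flag quoted above.

```shell
# Hypothetical helper: replace a StatefulSet object without restarting its pods.
#   $1 = StatefulSet name (placeholder)
#   $2 = path to the edited StatefulSet manifest (placeholder)
recreate_statefulset() {
  name="$1"
  manifest="$2"

  # Delete only the StatefulSet object and leave its pods running (orphaned).
  # Older kubectl releases spell this flag --cascade=false.
  kubectl delete statefulset "$name" --cascade=orphan || return 1

  # Recreate the StatefulSet from the edited manifest. Its selector must
  # still match the orphaned pods so that the new object adopts them.
  kubectl apply -f "$manifest" || return 1
}
```

Something like `recreate_statefulset mystatefulset mystatefulset.yml` would then do the swap; afterwards the pods can be deleted one at a time with `kubectl delete pod <name>` so the controller recreates them against the existing persistent volume claims.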

The Kubernetes creators never intended this verbose YAML format to be the long-term format for humans to work with directly. Heptio's ksonnet is where they want to go: https://ksonnet.io

No, this is not replacing the YAML under the hood, it's just more convenient for humans as a higher layer.

I found ksonnet by actually looking for a smarter JSON, jsonnet (which ksonnet is based on). I had a little experience with borgcfg while at Google, and while it's not the same, it's very similar in spirit. It even has easier-to-understand evaluation rules (unlike borgcfg, whose rules I could never fully get: I would understand them when focusing, and then if I hadn't used them in a while I would completely forget them again).

> In production can I use this or is there a security hole in there?

What if there's a bug in nginx? That has a lot more lines of code than the controller code. As always, feel free to audit the code, but as with any environment, you eventually have to trust someone's code.

> What if the ports the health check are using overlap with other ports I have assigned somewhere else?

Each pod gets its own network namespace, so its containers can bind to any port; only ports that are exposed can conflict (similar to how Docker works).

Honestly, kubernetes might not solve your use case. I use it because it solves mine (Self-healing, declarative configuring that works seamlessly across multiple nodes - aka accessing multiple nodes as one big computer).

You should not use Ingress. Use Nginx or HAProxy and run it on k8s like you would normally; you can then scale your nginx/haproxy with `kubectl scale --replicas=2 deploy nginx`.

On the outside, use metallb, which then gets you a single IP that is highly available either via L2 or with BGP (if you have BGP gear) if you are not on the cloud. What people do wrong with k8s is that they think different, which is silly. k8s just exposes a "managed vm" where you can build stuff like you would do on vmware vApps.

I disagree with two statements:

> You should not use Ingress

Why not? It allows you to route your applications automagically with Kubernetes objects. Instead of writing nginx configurations that do what you want, you can just describe how you want your routing to work. I don't see why that isn't useful.

> k8s just exposes a "managed vm" where you can build stuff like you would do on vmware vApps.

Pods aren't even containers, much less VMs. They're namespaces with containers in them.

Secondly, while you can use those pods like VMs and boot systemd or whatever in them, that's not really the way you're intended to use Docker. Just to quote an official source:

https://docs.docker.com/config/containers/multi-service_cont...

> It is generally recommended that you separate areas of concern by using one service per container.

Instead of treating Kubernetes like a VM manager, the actual intended way to use it is to treat it like a task manager, like systemd or what have you. The pods are meant to represent individual services, and containers individual processes.

The problem Kubernetes solves is managing applications, not machines. The difference is not merely semantic rambling; it's a paradigm shift.

You have to accept that Kubernetes is a platform, and any platform, no matter how simple or complex, will come with its own set of technical challenges. Complexity in itself isn't an evil. Unix is complex.

Just imagine the complexity of something like APT on Debian/Ubuntu, or RPM on Red Hat/CentOS. You could run into a problem with installing a package with apt-get or yum, perhaps some configuration script written in Bash that misbehaves during installation. To fix it, you have to understand how it's put together. The same applies to Kubernetes. You have to know the layers in order to work with them. Someone who doesn't know shell scripts or how init scripts work will not be able to work on Unix. Kubernetes is kind of like an operating system in the sense that it's a self-contained abstraction over something lower-level; the complexity of Unix isn't different, it's just that the design and implementation are different.

Helm "just" installs parameterized YAML manifests. But Helm doesn't pretend to be an abstraction that simplifies Kubernetes. It simplifies the chore of interacting with Kubernetes, but in order to really use Helm, you have to understand what it is doing. Specifically, you do have to understand the "1000+ lines" of ingress declaration that it spits out. The notion that you can get around the complexity of Kubernetes with Helm is simply false.
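
One way to get a handle on what a chart spits out is to render it locally and summarize it before reading it in full. A minimal sketch, assuming the rendered manifests arrive on stdin; `summarize_manifests` is a hypothetical helper, and it only looks at top-level `kind:` lines:

```shell
# Hypothetical helper: count rendered objects per kind.
# Reads a multi-document YAML stream on stdin and prints one
# "<count> kind: <Kind>" line per distinct kind. Only "kind:" lines with
# no leading indentation are counted, so nested "kind:" fields are ignored.
summarize_manifests() {
  grep -E '^kind:' | sort | uniq -c
}
```

With a chart checked out locally this could be fed from `helm template`, e.g. `helm template ./mychart | summarize_manifests` (the chart path is a placeholder, and the exact `helm template` arguments differ between Helm versions). It doesn't replace reading the output, but it shows how many objects there are to read.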

To start with Kubernetes, take a step back, forget about Helm, and simply use kubectl. You can accomplish absolutely everything you need with "kubectl apply -f". Learn each basic building block and how all of them fit together. Learn about pods before you learn about anything else. Deployments build on pods and are the next step. Then learn about services, configmaps and secrets. These are all the primitives you need to run stuff.
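
As a sketch of how those primitives fit together: a Deployment wraps a pod template, and a Service selects the resulting pods by label. The helper name `write_hello_manifest`, the `hello` names and labels, and the `nginx:stable` image are all illustrative assumptions, not anything from this thread.

```shell
# Hypothetical helper: write a minimal Deployment + Service manifest to $1.
write_hello_manifest() {
  cat > "$1" <<'EOF'
# The Deployment keeps two copies of the pod template below running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello
  template:            # this part is the pod definition
    metadata:
      labels:
        app: hello
    spec:
      containers:
      - name: web
        image: nginx:stable
        ports:
        - containerPort: 80
---
# The Service gives the pods labelled app=hello one stable name and port.
apiVersion: v1
kind: Service
metadata:
  name: hello
spec:
  selector:
    app: hello
  ports:
  - port: 80
    targetPort: 80
EOF
}
```

Running `write_hello_manifest hello.yml` and then `kubectl apply -f hello.yml` should create both objects; `kubectl get pods -l app=hello` lists the pods the Service will route to.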

Ingresses are arguably the worst part of Kubernetes, since they're a pure declarative abstraction. Unlike a pod, for example, an ingress doesn't say anything about how it should be served; it just expresses the end goal (i.e. that some paths on some hosts should be handled by some services). Ingress controllers are probably mysterious to beginners because they're an example of a "factory" type object: an ingress controller will read an ingress and then orchestrate the necessary wiring to achieve the end goal of the ingress.
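
To make "expresses the end goal" concrete, here is a sketch of an Ingress object. The helper name `write_ingress_manifest`, the host `example.com`, the service name and the port are placeholders. The `networking.k8s.io/v1` schema shown is the current one; clusters from the time of this thread used an older beta API group. Applying it does nothing by itself unless an ingress controller is running to act on it.

```shell
# Hypothetical helper: write a minimal Ingress manifest to $1.
write_ingress_manifest() {
  cat > "$1" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example
spec:
  rules:
  - host: example.com          # placeholder host
    http:
      paths:
      - path: /service1        # requests under this path...
        pathType: Prefix
        backend:
          service:
            name: service1     # ...should end up at this service
            port:
              number: 80
EOF
}
```

Note that nothing in the object says which proxy serves the traffic or how it is configured; that part is left entirely to whichever controller picks it up.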

Moreover, you don't need ingresses. Ingresses were invented a little prematurely (in my opinion) as a convenience to map services to HTTP endpoints and make these settings portable across clouds, but what most people don't tell you is that you can just run a web server with proxying capabilities, such as Nginx. This gist [1], which can be applied with "kubectl apply -f nginx.yml", describes an Nginx pod that will forward /service1 and /service2 to two services named service1 and service2, and will respond on a node port within the cluster (use "kubectl describe endpoints nginx" to see the IP and port). Assuming a vanilla Kubernetes install, it will work.

[1] https://gist.github.com/atombender/af2710818af0921e5c55a9ecb...
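
This is not the gist itself (its URL is truncated above), just a sketch of the same idea under stated assumptions: services named service1 and service2 listening on port 80 in the same namespace, the stock `nginx:stable` image, and a hypothetical helper name `write_nginx_proxy_manifest`.

```shell
# Hypothetical helper: write an Nginx reverse-proxy manifest to $1.
write_nginx_proxy_manifest() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-proxy-conf
data:
  default.conf: |
    server {
      listen 80;
      # nginx resolves these names once at startup, so both services
      # must already exist or the container will fail to start.
      location /service1/ { proxy_pass http://service1/; }
      location /service2/ { proxy_pass http://service2/; }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable
        ports:
        - containerPort: 80
        volumeMounts:
        - name: conf
          mountPath: /etc/nginx/conf.d   # replaces the image's default site
      volumes:
      - name: conf
        configMap:
          name: nginx-proxy-conf
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort        # exposes the proxy on a port of every node
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF
}
```

After `write_nginx_proxy_manifest nginx.yml` and `kubectl apply -f nginx.yml`, `kubectl get service nginx` shows which node port was assigned.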