by remram 1204 days ago
An operator operates something, i.e. it actively makes changes. If you want to deploy an application, a Helm chart is the correct way. It gives you a deterministic deployment that you can duplicate multiple times in your cluster, and you can dry-run it and see the generated manifests.

An operator is needed when you can't just deploy and forget about it. An example is the Prometheus operator, which will track annotations created by users to configure the scraping configuration of your Prometheus instances. Another example is cert-manager, which gets certificates into secrets based on Certificate and Ingress objects, renews them automatically before expiry, and does that by creating ingresses picked up by your ingress controller.

The advantage of an operator is that it will react to stuff happening in the cluster. The drawback is that it reacts to stuff happening, potentially doing unexpected things because changes happen at any time and you can't dry-run them. Another drawback is that they are usually global, so you can't run multiple versions at the same time for different namespaces (mainly because custom resource definitions are global).
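
To make the "reacts to stuff happening" part concrete, here is a minimal sketch of the reconcile pattern operators are typically built around, in plain Python with no Kubernetes client. The `reconcile` function and the dict-based state are hypothetical stand-ins for illustration, not any real operator's API.

```python
from typing import Dict, List, Tuple

# Hypothetical model: object name -> spec, standing in for what a real
# operator would read from the Kubernetes API.
State = Dict[str, dict]
Action = Tuple[str, str]  # (verb, object name)


def reconcile(desired: State, actual: State) -> List[Action]:
    """Return the actions needed to move `actual` toward `desired`."""
    actions: List[Action] = []
    # Create anything that is missing, update anything that drifted.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    # Delete anything that is no longer wanted.
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    desired = {"cert-a": {"dns": "a.example.com"}}
    actual = {"cert-old": {"dns": "old.example.com"}}
    print(reconcile(desired, actual))
```

A chart renders once from fixed inputs, whereas a loop like this runs again every time the observed state changes, which is roughly why its future actions can't be dry-run ahead of time.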

Unfortunately many people think packaging an application = creating an operator, and that operator does nothing a chart couldn't do.

3 comments

The CockroachDB example in the article is a perfect example of an unnecessary CRD. Acquiring certificates within a Kubernetes cluster is a common requirement for lots of applications and there are lots of solutions out there. Is it really necessary to spend time writing your own operator? Now you have a second Helm chart and an operator to maintain. Now you have to explain to people which chart to use. You could get rid of the non-operator chart, but now I have operators within the cluster acquiring certificates in 5 or 6 different ways. Do I have to configure the credentials for 6 operators so they can make Route53 DNS challenge records?

Edit: maybe we could shift left and ask the app developers to add certificate acquisition directly into the app source.

> Do I have to configure the credentials for 6 operators so they can make Route53 DNS challenge records?

A certificate for service to service communication does not have to correspond to a public endpoint.

That’s true. The actual process of granting a pod access to edit Route53 TXT records is pretty easy.

The problem is duplication of functionality within the cluster and increased complexity in the configuration of deployments. I want to configure a certificate acquisition process once, and plug those certificates into pods using a generic process based on annotations. I don’t want to configure it 6 times because each operator does things slightly differently. Each CRD introduces new complexity to the cluster. More opportunities for things to break. More stuff to read and learn about.

I think CRDs should be used to extend the platform functionality, not as a tool to simplify deployments. It feels a little like reworking the plumbing in your house to make the sink easier to install instead of using a wrench.

> that operator does nothing a chart couldn't do.

Or it can be actively harmful when it doesn't do any error checking whatsoever, making it less accurate than `helm template` would be. Relatedly, it's also one more thing to monitor, because it can decide to start vomiting errors for whatever random reason.

Neither of those cases really need an operator -- Prometheus and cert-manager both have code that watches for changes on ingresses/services/custom resources and reacts to changes (using permissions granted via RBAC). I've used both without an operator and still use Prometheus without one.
Everyone else commenting is missing the point, because as you've said, Prometheus can discover targets via k8s endpoints/pods dynamically already, and you can run a sidecar to reload the config file.

The main point of the Prometheus operator is to federate access to Prometheus configuration, so teams can manage configuration that the built-in Kubernetes service discovery doesn't let you control from the application side: things like scrape intervals, recording rules, alerts, etc. The ServiceMonitor and PrometheusRule CRDs basically let app owners ship these aspects of monitoring with their applications, instead of requiring a hand-off between the service owners and the SRE teams managing the monitoring infrastructure.

Prometheus does not change its config unless asked to… and in most cases the scrape config is stored in a configmap that does not magically change by itself.

So yes -> you need an operator to dynamically change the config of Prometheus. If not -> none of your changes will matter, because they will not be reflected in Prometheus.

There are several Prometheus helm charts out there. The one I use runs a sidecar container that watches for config changes and tells Prometheus to reload the config. So you don't need the operator - there are other approaches.
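
For anyone wondering what such a sidecar boils down to, here is a rough sketch in Python. It assumes Prometheus was started with `--web.enable-lifecycle`, so that a POST to `/-/reload` triggers a config reload. The helper names (`check_and_reload`, `post_reload`) and the default URL are made up for illustration, and the sidecars shipped with actual charts may work differently.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Callable, Optional


def file_digest(path: Path) -> str:
    """Hash the file content so changes can be detected cheaply."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_and_reload(
    path: Path, last_digest: Optional[str], reload: Callable[[], None]
) -> str:
    """Call `reload` if the content changed since `last_digest`.

    Returns the current digest, to be passed in on the next check.
    The first call (last_digest=None) only records the digest.
    """
    digest = file_digest(path)
    if last_digest is not None and digest != last_digest:
        reload()
    return digest


def post_reload(base_url: str = "http://localhost:9090") -> None:
    """Ask Prometheus to reload its config (assumed URL, lifecycle API enabled)."""
    req = urllib.request.Request(base_url + "/-/reload", method="POST")
    with urllib.request.urlopen(req, timeout=5):
        pass
```

A sidecar would call `check_and_reload(config_path, digest, post_reload)` in a loop with a short sleep, keeping the returned digest between iterations.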
But that configmap does not change by itself.

The same goes for the annotations approach - it's very limited.

Wdym the configmap doesn't change by itself? I don't want my configurations changing "magically" without my say so

The two features I need (and can get without the operator) are: 1. If I (manually, explicitly) change the configmap, Prometheus can pick it up without being restarted. This is provided by the sidecar container. 2. If new pods come online, they are automatically detected and scraped by Prometheus. As long as they have the annotations saying a) this pod should be scraped, b) scrape on this port, and c) scrape at this URL, they will be scraped.
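
The annotation logic in point 2 can be sketched roughly like this. The `prometheus.io/scrape`, `prometheus.io/port` and `prometheus.io/path` names are the commonly used convention, wired up through relabeling rules in the Prometheus config rather than built into Prometheus itself. The `scrape_url` function is a hypothetical illustration, and skipping pods without a port annotation is a simplification of this sketch.

```python
from typing import Dict, Optional


def scrape_url(pod_ip: str, annotations: Dict[str, str]) -> Optional[str]:
    """Return the URL to scrape for a pod, or None if it should be skipped."""
    # a) the pod has to opt in to scraping
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    # b) which port to scrape (this sketch skips pods that don't say)
    port = annotations.get("prometheus.io/port")
    if port is None:
        return None
    # c) which path to scrape, defaulting to /metrics
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


if __name__ == "__main__":
    pod_annotations = {
        "prometheus.io/scrape": "true",
        "prometheus.io/port": "8080",
    }
    print(scrape_url("10.0.0.5", pod_annotations))
```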

I dunno what to tell ya, I see pods come and go from Prometheus all the time, including those from newly added Service objects with Prometheus annotations. I'm using kubernetes_sd_configs.
You don't NEED an operator, but there are benefits.

If all of your pods have the same scrape settings, there's not much benefit to the operator. But if each pod or set of pods needs custom scrape settings, say different scrape intervals or custom tag rewrites, the operator will let you define each set of scrape configs separately as Kubernetes resources. You can store the definitions alongside the resources they scrape instead of having a single large, complex config in your Prometheus deployment. This would be especially beneficial if you have multiple projects owned by multiple teams, all scraped by the same Prometheus.
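
As a rough picture of what that buys you, here is a sketch of separate per-team definitions being folded into one list of scrape configs. `ScrapeDefinition` and `build_scrape_configs` are hypothetical stand-ins for the operator's resources and its config generation, not its real implementation. Only the output keys (`job_name`, `scrape_interval`, `static_configs`, `targets`) are actual Prometheus config fields.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ScrapeDefinition:
    """Hypothetical per-application definition, owned by one team."""
    team: str
    job: str
    interval: str
    targets: Tuple[str, ...]


def build_scrape_configs(definitions: List[ScrapeDefinition]) -> List[Dict]:
    """Merge independent definitions into one ordered scrape config list."""
    configs = []
    # Sort so the generated config is stable regardless of input order.
    for d in sorted(definitions, key=lambda d: (d.team, d.job)):
        configs.append({
            "job_name": f"{d.team}/{d.job}",
            "scrape_interval": d.interval,
            "static_configs": [{"targets": list(d.targets)}],
        })
    return configs


if __name__ == "__main__":
    defs = [
        ScrapeDefinition("payments", "api", "15s", ("10.0.0.5:8080",)),
        ScrapeDefinition("batch", "worker", "60s", ("10.0.1.9:9100",)),
    ]
    for cfg in build_scrape_configs(defs):
        print(cfg)
```

Each team only touches its own definition, and the single big config becomes a generated artifact instead of something edited by hand.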

Granted I also don't use the operator, but I've looked into it. With complex enough deployments it would simplify things.