by cookiecaper 2973 days ago
> You may be familiar with Operators from the concept’s introduction in 2016. An Operator is a method of packaging, deploying and managing a Kubernetes application.

"Operators", as introduced in 2016, were just bespoke Go programs that communicated with Kubernetes internals in a pretty low-level way.

You were writing special-case plugins for Kubernetes, but they didn't want to make it sound that way, because I guess that just doesn't sound hip or devopsy. This branding exercise worked out for CoreOS -- Red Hat just bought them.

This whole space is massively infused with bullshit. It's because all of these companies want to make money selling you cloud stuff, because it's profitable to rent computers at 3-5x the TCO. Google especially is hungry to claw back the lead in the cloud space from Amazon, and it's not hard to conceive why Kubernetes doesn't seem to work without fuss anywhere except GKE, or to understand the massive marketing dollars that Google is pumping into this whole Kubernetes farce (and for the record, Google seems to consider HN an important platform for k8s PR; I've been censured after too many Googlers found my k8s-skeptical posts "tedious").

Anyway, I guess that's neither here nor there. Just annoyed at what is by now the totally conventional status quo of overhyped empty promises made by people who seem more like ignorant promoters and fanboys than serious engineers.

This "Operator Framework" seems to be the same concept of Operators, just with additional library support for the plugins -- err, "Operators". It may be a good improvement; I'll have to research more.

4 comments

Operators are just a pattern, not a technology.

I think they chose "operator" over the more traditional "controller" because the latter can be quite simple, whereas an operator is potentially a combination of several things, including CRDs, API extensions, and controllers. For example, an operator might start different controllers depending on what cloud it's deploying to. It's a useful distinction; if someone says "I'm using this operator for X", I instantly know what they mean.
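To make the controller half of that pattern concrete, here is a minimal sketch of a reconcile loop: compare the desired state with the observed state and act on the difference. Everything in it is hypothetical (`DesiredState`, `FakeCluster`, and `reconcile` are illustrative names, and the "cluster" is an in-memory list); it does not use any Kubernetes client library and is not how any particular operator is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class DesiredState:
    """Stand-in for a custom resource's spec (hypothetical, not a real CRD)."""
    replicas: int


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster; no Kubernetes API is involved."""
    running: list = field(default_factory=list)
    next_id: int = 0

    def start_instance(self) -> None:
        self.running.append(f"instance-{self.next_id}")
        self.next_id += 1

    def stop_instance(self) -> None:
        self.running.pop()


def reconcile(desired: DesiredState, cluster: FakeCluster) -> int:
    """Drive observed state toward desired state; return the number of actions taken."""
    actions = 0
    while len(cluster.running) < desired.replicas:
        cluster.start_instance()
        actions += 1
    while len(cluster.running) > desired.replicas:
        cluster.stop_instance()
        actions += 1
    return actions


if __name__ == "__main__":
    cluster = FakeCluster()
    spec = DesiredState(replicas=3)
    print(reconcile(spec, cluster), cluster.running)
    cluster.running.remove("instance-1")  # simulate a crashed instance
    print(reconcile(spec, cluster), cluster.running)
```

A real operator would presumably run logic like this in response to watch events and would carry application-specific knowledge (upgrades, backups, configuration reloads) rather than only counting instances.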

FWIW, I'm one of those who remember your name, simply because you pop up in every Kubernetes discussion with a predictably contrarian, long-winded opinion. I don't know what you're getting out of it. In this case, you're not wrong — but the curmudgeonly, somewhat tone deaf way that you go about it isn't very nice, which probably explains the downvotes.

>FWIW, I'm one of those who remember your name, simply because you pop up in every Kubernetes discussion with a predictably contrarian, long-winded opinion.

There are several k8s conversations on HN each day. I skip most of them.

> I don't know what you're getting out of it.

I get a lot of useful feedback out of it. Most of the posts I make on HN are about trying something out and gauging the response, because I want to learn from it. Sometimes I get a complete correction, which is great, because I stop believing something that's wrong. More often I get minor shifts in my personal POV, perspective on what arguments are effective and which aren't, the pitfalls/common counterarguments to specific positions, good feedback on tone / interpretation, and lots of other valuable information. Also, once in a while I make a nice personal connection.

I'll note for the record that response on k8s/containers is mixed. There are certainly a substantial number of threads where I'm at the bottom, but there are also threads where I make essentially the same arguments and score pretty well, along with a few supporting comments from people who say they don't get it either. HN's responses generally seem to be signaled by the tone of the thread and headline, and the preponderance of existing responses. If the groundwork is laid with a positive outlook, negative comments will usually have a hard time, and vice-versa.

Also, my arguments are usually not purely repetitive, even if they have the same core message (because the same core things remain relevant). I have never talked about Operators on HN before. They came up and they're a good example of how people pretend that Kubernetes is more production-ready than it is, by obfuscating things like "you have to write special programs to teach Kubernetes how to deploy and manage your applications because the YAML configuration interface they tout isn't good for complex cases", behind the much hipper "Build a Kubernetes Operator, then you'll be cloudified and Dockerized out the wazoo!!"

I admit that I find a culture focused on this type of hype to be grating and immature, and as a sign of its inability to really bring substantial improvement to the table. I don't think I feel this way about anything new in general, I just think it's a reaction to a progressively-worsening engineering deficit in the "devops" field. I hope that I can learn whether this is right or wrong as time goes on.

> In this case, you're not wrong — but the curmudgeonly, somewhat tone deaf way that you go about it isn't very nice, which probably explains the downvotes.

Yeah, so this is a great example of why I continue to post about this. Most people would consider this subject matter very dry, but Kubernetes is something that people imbue with much more personal identification than is typical for infrastructure orchestration projects. Where are the CloudFoundry Diego disciples (paging jacques_chester ;) )?

It's important to post and learn the pressure points, and if there is any argument or circumvention that is effective against that identity imprint. I'm still trying to learn, so I continue to post and draw feedback from the community. I appreciate your participation in teaching me thus far. :)

Your comment would be more useful if it shared any hands-on experience with Operators instead of the usual ranting. I think you've made your general opinion on containers and Kubernetes pretty clear already.

First, I don't think anyone notices or cares when I post specifically, so it's hard to really feel like my "general opinion" is well-known enough to not talk about it anymore. I'm no Joel Spolsky over here!

Second, I feel like it's valid to point out that Operators are not really just a method of "packaging", in a post that tries to make it sound like Operators are just a small bit of YAML or metadata. You're writing real, non-trivial Go code that tells Kubernetes explicitly how to deploy and manage the lifecycles of specific types of applications.

At least until now with the "Operator Framework", there wasn't really even anything that firmly defined an Operator as an Operator; it's just what some people called their Go code that manipulated k8s's object handling and lifecycle internals.

But, if you insist, here's one operator I've worked with: https://github.com/coreos/prometheus-operator . This is from CoreOS itself.

Here's a patch I submitted about a year ago: https://github.com/coreos/prometheus-operator/pull/289 . This required updating the way the software handled HTTP response codes in one of its "watcher" daemons (because all packaging methods need those, right?), and fixing the order of operations in the bootstrap scripts.

Some more general info about this repo:

    $ du -sh prometheus-operator/.git
        51M     prometheus-operator/.git
The repo size is 51M.

    $ git rev-list --count master
        1716
There have been almost 2000 commits.

    $ cloc --vcs=git --exclude-dir vendor,example,contrib .
         290 text files.
         278 unique files.
         121 files ignored.
    
    github.com/AlDanial/cloc v 1.74  T=0.82 s (295.4 files/s, 48117.2 lines/s)
    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    Go                              50           1581           1392          20622
    JSON                             9              0              0           6276
    YAML                           132            260            885           4164
    Markdown                        30            792              0           2957
    Bourne Shell                    17             65             58            257
    make                             1             34              2             91
    Python                           1             10              5             40
    TOML                             1             11             20             30
    Dockerfile                       2              8              0             27
    -------------------------------------------------------------------------------
    SUM:                           243           2761           2362          34464
    -------------------------------------------------------------------------------
It appears there are over 20k lines of Go code after excluding vendor libraries and the example and contrib directories (arguably, contrib should've been included).

I dunno, it just feels a little disingenuous, to me, to say that something that involves this much code is just a "packaging method" for a normal application. "Sure, just write an operator to package that up" like it's comparable to a package.json manifest or something. It's not! You need custom daemons that watch files to make sure that your k8s deployment stays in sync, and then you need to exert very meticulous and specific control over Kubernetes' behavior to make things work well.

I think it's demonstrative that it takes north of 20k lines of Go code to package an application for deployment on Kubernetes. What do you think?

-------------

EDIT: And one clarification: my opinion on containers as such is probably not well-known, since you're conflating it with my opinion on Kubernetes.

I like containers conceptually (who wouldn't?) and I run several of them through LXC:

    NAME               STATE   AUTOSTART GROUPS IPV4            IPV6 UNPRIVILEGED 
    axxxx-dev          STOPPED 0         -      -               -    false        
    gentoo-encoder     STOPPED 0         -      -               -    false        
    jeff-arch-base-lxc STOPPED 0         -      -               -    false        
    jeff-crypto        RUNNING 0         -      xxx.xxx.xx.xxx  -    false        
    jeff-ffmpeg        STOPPED 0         -      -               -    false        
    jeff-netsec        STOPPED 0         -      -               -    false        
    jeff-ocr           STOPPED 0         -      -               -    false        
    localtorrents-lxc  RUNNING 0         -      xxx.xxx.xx.xxx  -    false        
    nim-dev            STOPPED 0         -      -               -    false        
    plex-2018          RUNNING 1         -      xxx.xxx.xxx.xxx -    true         
    unifi              STOPPED 0         -      -               -    true    
I believe this is the kind of thing people actually want. Highly efficient, thin "VMs" that are easy to manage and run as independent systems without requiring the resource commitment.

There is a good place for Kubernetes in probably about 1% of deployments where it's used. Most other people are just trying to run something like LXC, but they're confused because everyone who is critical of k8s drops to -4 and gets HN's mods after them. :)

> It appears there are over 20k lines of Go code after excluding vendor libraries and the example and contrib directories (arguably, contrib should've been included).

close..

  $ find -name '*.go'|grep -v ./vendor|xargs wc -l|sort -n|tail
     507 ./test/e2e/alertmanager_test.go
     540 ./pkg/client/monitoring/v1/types.go
     562 ./pkg/alertmanager/operator.go
     643 ./pkg/prometheus/statefulset.go
     719 ./pkg/prometheus/promcfg.go
     760 ./test/e2e/prometheus_test.go
     835 ./pkg/client/monitoring/v1/zz_generated.deepcopy.go
    1152 ./pkg/prometheus/operator.go
   11410 ./pkg/client/monitoring/v1/openapi_generated.go
   24526 total
Deleting the auto-generated API files and ignoring test/ gives

  -------------------------------------------------------------------------------
  Language                     files          blank        comment           code
  -------------------------------------------------------------------------------
  JSON                             9              0              0           6276
  Go                              31           1020           1065           6028
which is quite a bit off from 20k.
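For anyone who wants to redo this kind of count with different exclusions, here is a rough sketch in Python. The function name `count_go_lines` and its parameters are made up for illustration. It counts raw lines the way `wc -l` does (blank lines and comments included, unlike cloc's "code" column), it skips any directory named `vendor` or `test` at any depth, and it treats any file with "generated" in its name as generated, which is an assumption based on the two file names in the listing above.

```python
import os


def count_go_lines(root, skip_dirs=("vendor", "test"), skip_markers=("generated",)):
    """Count lines in .go files under root, skipping the named directories
    and any file whose name contains one of the markers."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if not name.endswith(".go"):
                continue
            if any(marker in name for marker in skip_markers):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                total += sum(1 for _ in fh)
    return total


if __name__ == "__main__":
    # Run from a checkout of the repository being measured.
    print(count_go_lines("."))
```

Passing `skip_dirs=("vendor",)` keeps the tests in the count, which is the variant argued for elsewhere in this thread.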

> I dunno, it just feels a little disingenuous, to me, to say that something that involves this much code is just a "packaging method" for a normal application.

What is disingenuous is to call the prometheus operator, that deploys an entire monitoring stack a "normal application". The operator sets up monitoring on all the nodes, runs all the server components including configuring grafana and setting up dashboards.

Meanwhile,

  $ cloc puppet-prometheus-5.0.0 puppet-grafana-4.2.0
     192 text files.
     157 unique files.
      48 files ignored.
  
  github.com/AlDanial/cloc v 1.76  T=1.09 s (131.9 files/s, 10534.0 lines/s)
  --------------------------------------------------------------------------------
  Language                      files          blank        comment           code
  --------------------------------------------------------------------------------
  Ruby                             43            425            108           2477
  Puppet                           32            132           1747           2280
  Markdown                          6            494              0           1201
  ERB                              20            105              0            925
  YAML                             35              0            200            459
  JSON                              4              0              0            329
  Bourne Shell                      2             28            130            198
  Bourne Again Shell                2             28             56            176
  --------------------------------------------------------------------------------
  SUM:                            144           1212           2241           8045
  --------------------------------------------------------------------------------
Which is about as large and doesn't do ANY of the things that the prometheus operator does.

> I believe this is the kind of thing people actually want. Highly efficient, thin "VMs" that are easy to manage and run as independent systems without requiring the resource commitment.

Or, people want to use k8s so they can run entire clusters of machines as a single consistent system and take advantage of things like rolling deployments and self healing applications.

> which is quite a bit off from 20k.

True. I certainly admit that I don't know the code well enough to know exactly which lines or files are critical and which aren't, but I think it's getting into the weeds to nitpick the specifics too much (for example, I would argue that while it's probably good to exclude generated files, you shouldn't exclude the tests from the line count since they require real human time to maintain).

The point is that thousands of lines of unique, application-specific code are required to create an "operator" that runs Prometheus within k8s. This is not what most people think of when someone says "a method for packaging".

> What is disingenuous is to call the prometheus operator, that deploys an entire monitoring stack a "normal application"

While Prometheus is certainly a big system in its own right, I don't think that necessarily makes it a bad representative. Many people are planning to port their own complicated systems to Kubernetes.

> Or, people want to use k8s so they can run entire clusters of machines as a single consistent system and take advantage of things like rolling deployments and self healing applications.

It's hard to talk about this because there is so much wrapped up into the gob that is k8s, and of course not all of it is bad. But we've had "rolling deployments" and "self-healing applications" before, without having to write 10k+ lines of code to manage the platform deployment. These aren't a new thing to k8s.

k8s provides a platform that gives a nomenclature to them, but it's not always clear that there is a benefit to running on that platform v. running more traditional setups, especially when you consider that you still have to configure and code your (k8s-internal) load balancers, web servers, and applications to handle these things.

There's no free lunch. Kubernetes is a container orchestrator. It automates system-level commands like "docker run ..." and provides a (mostly redundant) fabric for those containers to feed into. That's great and there are some people who really need that, but far too many people read comments like yours and interpret it to mean "If I use Kubernetes I will have self-healing applications". It doesn't work that way.

You just said you don't understand the code well enough to discern which files matter and which don't, then go on to say it takes thousands of lines of code.

If you remove the generated code, and discount the tests, it's barely a thousand. Much of it is the test code, and Go is fairly verbose for testing code.

It generally takes maybe a few hundred lines of code to write simple to moderately complex operators. A lot of it is generated and boilerplate. I would say a lot of that's due to lack of generics in Go, but I wouldn't say it's very much code overall. Additionally, the framework being presented here aims to reduce that down by removing the boilerplate, and making it easier to express the end goal (eg: self-healing, auto-rebalancing, etc) using less code.

It's certainly not much more code to implement an operator than what you would see in a well written Puppet module/Chef cookbook/Ansible playbook, and it does a lot more. You certainly could try to do self-healing using these tools, but it's significantly more difficult in my experience.

I agree that there's no free lunch and that you won't just necessarily get self-healing applications by using Kubernetes. But it's certainly easier to build them when using Kubernetes. The only thing that's really changed is instead of writing to a particular cloud provider API to handle this, you're able to leverage something more agnostic to the specific cloud/vendor you're using for your infrastructure.

> But we've had "rolling deployments" and "self-healing applications" before, without having to write 10k+ lines of code to manage the platform deployment

You continue to be disingenuous and imply that every application requires 10k lines of code to run on k8s.

I recently used k8s to deploy an application. To configure 2 services with exterior and interior load balancing and health checks for rolling deployments and self healing took 115 lines of yaml, maybe 40 of which was specific to my application.

115 lines. Not 10,000+

Then once things were working I created a 2nd namespace for production and deployed an entire 2nd copy of everything. This took me 10 minutes and 2 kubectl commands.

> "If I use Kubernetes I will have self-healing applications". It doesn't work that way.

But that's exactly how it worked. I wrote 115 lines of yaml and had multiple environments, load balancers, health checks, and rolling deployments.

I know how to do this using "traditional setups", and I know it takes a lot more than 115 lines of generic yaml.
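For readers who haven't seen what that kind of configuration covers, here is a hedged sketch of one piece of it: a Deployment with a rolling-update strategy and HTTP health checks, built as a Python dict and printed as JSON. It is not the 115 lines described above. The name, image, port, and `/healthz` path are placeholders, and the function `deployment_manifest` is hypothetical; only the manifest field names come from the Kubernetes `apps/v1` Deployment schema.

```python
import json


def deployment_manifest(name, image, port, replicas=2):
    """Build a Deployment manifest with a rolling-update strategy and HTTP probes.
    name, image, port and the /healthz path are placeholders for illustration."""
    labels = {"app": name}
    probe = {"httpGet": {"path": "/healthz", "port": port}}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Keep every old pod serving until its replacement reports ready.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "readinessProbe": probe,
                        "livenessProbe": probe,
                    }]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = deployment_manifest("web", "example.invalid/web:1.0", 8080)
    print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the output could in principle be piped to `kubectl apply -f -`, with `-n` selecting the namespace. The probes only report what the application itself exposes at that path, so how much "self-healing" this buys depends on the application.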

> You continue to be disingenuous and imply that every application requires 10k lines of code to run on k8s.

Let me clarify and state unambiguously that it won't necessarily take 10k lines of code to run any random application on Kubernetes.

You can, in fact, deploy Prometheus without using the Prometheus operator and you'll technically be "running your monitoring" within k8s. It just isn't likely to be very reliable or useful. :)

> But that's exactly how it worked. I wrote 115 lines of yaml and had multiple environments, load balancers, health checks, and rolling deployments.

If you already had a fully "stateless", self-healing capable application running on not-k8s, and your layout is as simplistic as "2 services with load balancers", you can probably move to Kubernetes with a comparatively small amount of fuss. If your existing setup was pretty tiny, this may have been a worthwhile project.

If you didn't already have a stateless, self-healing-capable system, and you didn't change your application to accommodate it as part of the port, then regardless of what Kubernetes reports about your pod state, you don't have a self-healing application.

The barrier between application and platform is artificial. They must work together. It's sort of a convenient fantasy that you can try to demarcate these areas. You can't just take any random thing and throw it on Kubernetes and say it's all good now because you can watch k8s cycle your pods.

Maybe you think this is implicit, but as someone who has spent the last 2.5 years building out k8s clusters for software written by average developers, I can assure you that there are a great many people who aren't getting this message.

I went full-time freelance about a month ago. One of the last in-house k8s services I deployed, the guy told me, "Oh yeah, we can't run more than one instance of this, or it will delete everything." Yet, these people are very proud of the "crazy scalability" they get from running on Kubernetes. Hope the next guy reads the comments and doesn't nudge that replicas field!

If you already had a non-trivial system that worked well for failover, recovery, self-healing, etc., why'd you replace it with something that is, for example, still just barely learning how to communicate reliably with non-network-attached-storage, as a beta feature in 1.10 [0], released last month? There are many things that sysadmins take for granted that don't really work well within k8s.

I accept that at first glance and with superficial projects, it can be easy to throw the thing over the fence and let k8s's defaults deal with everything. This is definitely the model and the demographic that Google has been pursuing. But if you have something more serious going on, you still have to dig into the internals of nginx and haproxy within your k8s cluster. You still have to deal with DNS. You have to deal with all the normal stuff that is used in network operations, but now, you're just dealing with a weirdly-shaped vaguely-YAMLish version of it, within the Great Googly Hall of Mirrors.

Once you do that enough, you say "Well, why am I not just doing this through real DNS, real haproxy, real nginx, like we used to do? Why am I adding this extra layer of complication to everything, including the application code that has to be adapted for Kubernetes-specific restrictions, and for which I must write <INSERT_ACCEPTABLE_LINE_NO_HERE> lines of code as an operator to ensure proper lifecycle behavior?"

Most people aren't willing to give themselves an honest answer to that question, partially because they don't really ask it. They just write some YAML and throw their code over the fence, now naively assured that the system is "self-healing". Then they get on HN and blast anyone who dares to question that experience.

[0] https://github.com/kubernetes/features/issues/121

I'm sorry you've been down voted. FWIW, I've asked you several times for feedback, but you've never responded. We'd love to be better!

Aronchick (at) google.com

Disclosure: I work at Google on Kubeflow

Minor correction: I've replied a couple of times on HN and gotten no response. You once clarified that this is because you don't check HN often enough to reply before the deadline, which is fine. I do admit that you once sent me an email and that I didn't reply to it.

cookiecaper covers his lack of response to k8s threads in his bio at https://news.ycombinator.com/user?id=cookiecaper. There is an email if you want to contact him directly.

In a world where we are trying to codify more and more of the operational side of our applications, I personally look at operators as a better alternative to the configuration management systems we use to configure generic operating systems today.

We're moving away from an imperative configuration/operational model to a declarative one. While these operators target applications running in k8s, I could imagine them being created to manage applications running elsewhere as well.

I agree with your conclusions that the world is becoming more declarative and that we need to converge on an agnostic system to make these declarations.

However, you might've been trained incorrectly, perhaps as a joke, because you've assigned properties to the opposite solutions!

Configuration management tools like Ansible or Salt provide agnostic modules that can be plumbed under the hood to work with any platform: use Ansible's "copy" module and it can be implemented under the hood as cp, rsync, passenger pigeon, whatever. You can use the same playbook against any target, including targets that run on orchestration platforms like Kubernetes.

Kubernetes Operators, on the other hand, are tightly linked to Kubernetes internals. They are not declarative; they require explicit instruction on how to manage your application's lifecycle, implemented in a statically-typed programming language. Indeed, if your system doesn't need to muck around in the Kubernetes internals, you don't create an "Operator", you just use the pre-baked object types like Service, Deployment, etc.

So I agree in principle, but Kubernetes is the opposite of what you've expressed here.