by carlosrdrz 2817 days ago
I do agree with you, but I don't think I really missed the point of the original article. From the original article:

> However popular wisdom would suggest that Kubernetes is an overly complex piece of technology only really suitable for very large clusters of machines; that it carries a large operational burden and that therefore using it for anything less than dozens of machines is overkill. I think that's probably wrong.

I don't think that is wrong. I do think it is probably overkill, and IMO it does introduce operational burden and complexity. That doesn't mean you shouldn't do it, though, if you're interested in exploring the technology, for example.


I don't understand what the operational burden is. We literally do nothing to our K8s cluster, and it runs for many months until we make a new updated cluster and blow away the old one. We've never had an issue attributed to K8s in the 2 years we have been running it in production. If we ever did, we'd just again deploy a new cluster in minutes and switch over. Immutable infrastructure.

It is not like I haven't done it the "old" way. I spent many years doing hand deploys, making deployers, running Ansible/Chef. It is just that we always found we can never confidently update servers running many apps as it would step on other applications. So we'd just make new ones, test and switch. This was not an easy process either. Plus we'd encounter issues like oh someone didn't make a startup script or filled up /var with logs, or had something eat up all the memory. All of these operational problems are gone with K8s. I know what you are thinking "well you did it wrong". Yes sometimes developers do things wrong. But in container/K8s land that wrong stuff is contained, and if you don't do things "right" you can't even run.

So we had operational issues there. Now we have a universal platform where someone can ship their app anywhere and have it run the same. That is a huge win. All for no extra work.

Operational burden comes when you have to troubleshoot an issue. Simply deleting and recreating doesn't solve recurring problems.

I have had the same experience and the same journey from hand deploys, using configuration management, and all of that.

Why is your comment gray?

> using it for anything less than dozens of machines is overkill

The question isn't really whether you need dozens of machines, it's whether you can foresee eventually maybe needing dozens of machines.

Remember the bad old days when people said that relational databases were worthless because they "don't scale", and that using Mongo and other NoSQL databases was practically a necessity for doing anything modern and "web-scale", because otherwise, after you got your big break and got popular, you wouldn't be able to keep up with all the new traffic without crashing? A lot of engineers have this tendency to worry about scalability long before it's ever a problem. Something about the delusions of grandeur incurred by people who got into engineering because they were inspired by great people building big things.

Starting out by running Kubernetes on a three-node cluster is actually the correct call for a small project if you can reasonably foresee needing to elastically scale your cluster in the future, and don't want to waste days or weeks porting to Kubernetes down the line to deal with your scalability problems that you foresaw having in the first place.

Again, that doesn't mean that Kubernetes is right for every hobbyist project. But there is definitely a (small) subset of hobbyist projects for which it is not overkill.

> A lot of engineers have this tendency to worry about scalability long before it's ever a problem.

"Premature optimization is the root of all evil" -- Donald Knuth

Translating the old suggestions [1] to the realm of devops, I think the point really is: if you are fairly certain that your optimization (k8s, Docker and so on) will result in "better" code and practices right away, then you should do it. If not, you shouldn't.

I personally find this stuff to be way overkill in a lot of cases. Does Kubernetes really accelerate your development process? If you are a two-person startup, your objective is to find ways to deliver value as quickly as possible, not to play at being Facebook. When you become Facebook (or even just Basecamp), you will then have enough resources to do this optimization. But if you feel your development process is really so much better with k8s and friends (because it's what you used in a previous job or something), by all means go for it.

[1] http://wiki.c2.com/?PrematureOptimization

100% agreed. If a dev thinks they might want to put something in Kubernetes eventually, there are a few best practices that they could adopt early on to make that easy. But basic hygiene aside, they should wait.

The number of projects that do scale up is much, much smaller than the number of projects that might. If I actually want my project to serve a zillion users, the right place for me to focus my effort is not on Kubernetes, but on user context interviews, user tests, and fast iteration based on the results of experiments.

Kubernetes has a definite whiff of NoSQL - a massively hyped tool/technique originating from Google with oversold benefits.

I tried it about 6 months back with the intent of using it in a corporate prod environment and getting set up was... a massive pain in the ass to say the least - compared to the existing ansible set up. It was supposed to solve headaches, not cause them.

I wasn't impressed. I wouldn't be surprised if it ends up being "Angular 1.0" to someone else's React.

i set up a kubernetes cluster 1 year ago at work and a private one last weekend.

last year took, i think 2 days. my private one was up and running within ~1h, including writing the ansible role to first install binaries/dependencies and join the cluster as a worker node.

either you didn't use kubeadm to set it up or ... i have no idea how you could've possibly failed.

it's pretty much

    # on every machine
    ${packagemanager} install docker-ce kubectl kubelet kubeadm
    # on the master (prints a join token)
    kubeadm init
    # on each worker node
    kubeadm join ${token}

Jeff Geerling even wrote an Ansible role to do all of the heavy lifting for you. I've used it alongside Vagrant to spin up a three node cluster in ~15 minutes.

https://github.com/geerlingguy/ansible-role-kubernetes

I've used OpenStack a good bit, but not Kubernetes directly, and I have never set it up. Is there an up-to-date, in-depth tutorial around?

He's written so many useful and well-maintained Ansible roles.

I didn't fail, I just couldn't see a strong ROI after doing a spike.

I did use kubeadm. Getting a basic working cluster up required considerably more than 3 simple steps. Two days was more like it.

Managing Kubernetes yourself is a headache, which is why I more or less only consider managed services like Google Kubernetes Engine for real use-cases. That's why the original article showing that you could install and run it on a set of 3 micro preemptible instances for ~$5 a month was so compelling to me.

How does kubeadm take two days? I had to rebuild my cluster recently and it literally took me 20 minutes.

I didn't use it last time. It was marketed as beta iirc.

You don't really need to jump into Kubernetes to be prepared to eventually migrate to it. As long as you're using some kind of compatible containerization platform it is not incredibly hard to shift those workloads to Kubernetes. If you're really only needing to run a small handful of containers, running them directly in Docker or rkt isn't challenging and has very little operational overhead compared to running a full k8s cluster.

This echoes my thoughts as well. You can get pretty far with just a simple Docker-Compose file for many personal projects, and still leave the door open to a relatively easy transition to more advanced orchestration tooling should the project grow large enough to require it.

Personally I really like Docker Compose files - virtually no overhead to maintain alongside a handful of Dockerfiles, coupled with really simple syntax for expressing the relationships between them. Containers themselves seem conceptually challenging for some newcomers in my experience (concepts like image immutability don't have many analogues if you are new to containers, I'd argue, and I've seen some very experienced developers get stuck trying to map them conceptually to VM images, which is not a good fit), but the payoff in the ease of deploying your work is huge. It's nothing a reasonable developer can't learn in a few hours though, and the documentation is pretty good.

I especially find Docker great for projects I worked on years ago - I no longer really need to keep track of how to install/configure the software dependencies of a side project I haven't touched in ages, just hit 'docker-compose up' and I'm running.

I also really like Docker-Compose as a replacement for things like Vagrant scripts to create developer environments in some scenarios - way less overhead than a vagrant script spinning up multiple VMs on your laptop, and generally much faster deployments.
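
To make the "simple Docker-Compose file" idea concrete, here is a minimal sketch of what one might look like for a small side project. Everything in it is an assumption for illustration: the service names, the Postgres image tag, the ports, and the password are hypothetical, and a real project would fill in its own.

```yaml
# docker-compose.yml (illustrative sketch; all names and values are hypothetical)
version: "3"
services:
  web:
    build: .                 # built from the Dockerfile in this directory
    ports:
      - "8080:8080"          # host:container port, assumed for the example
    environment:
      DATABASE_URL: postgres://postgres:example@db:5432/postgres
    depends_on:
      - db                   # start the database container before the web app
  db:
    image: postgres:10       # assumed image tag
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - db-data:/var/lib/postgresql/data   # keep data across container restarts
volumes:
  db-data:
```

With a file like this in the project root, 'docker-compose up' starts both containers, and the same images could later be moved to an orchestrator if the project outgrows a single host.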

> The question isn't really whether you need dozens of machines, it's whether you can foresee eventually maybe needing dozens of machines.

Kubernetes doesn't manage the machines. It manages the applications on machines that are managed with something else.

You have to do something else to manage the machines.

cluster-autoscaler can manage the machines easily. The point is that Kubernetes is an abstraction layer, and whether to use this level of abstraction is up to the user. The question is whether to go down to the container level or work with the node itself. In a production environment it makes sense to use something that has the potential to scale almost indefinitely, but for hobby projects the mental overhead doesn't justify using k8s.

You could say that Kubernetes "manages the machines" from the point of view of the application, by providing a suitable environment to execute in (including actually starting the application).

I have to disagree. It doesn't introduce burden and complexity. They're already there whether you use Kubernetes or not.

The difference is that if you did build it all by hand as the author said, if it ever scales, you're going to have double the job to make it scale.

It's all a question of: do I think my software will succeed?

If it's a hobby project that will never get big, it's not worth the hassle. If it actually has a chance of succeeding, the small added complexity of Kubernetes will pay dividends extremely quickly when the system needs to scale.

Even with as little as two machines, I'd argue k8s is already adding more value than managing those by hand. People can say otherwise because they're used to it, but being used to it is not the point of the discussion.

The author also talks about Ansible which is another piece of complexity that would be comparable with doing it in k8s. I'd argue you have less YAMLs with k8s than with Ansible for a small project.

The only argument I see for doing anything by hand today is if it's a play thing.

I see what you mean, but I don't really agree. You can introduce that burden and complexity whenever you want. If you spend the start of your project working on this, you will be prepared for scaling (if you ever need it) but you could have used that time on actually working on the project and checking if you actually will need scale at some point.

I don't know about your projects, but in my case most (all) of them don't really need any kind of scale. Hell, this blog has a tiny $5 DO machine and is still happily serving traffic from HN. I do understand not all projects are small blog instances, though :)

I guess in my case I prefer to just keep it simple and see how far I go with that setup than spending time working on making the perfectly-scalable project that is never serving more than 2 requests per second. If it ever grows, I will need to work to make it scale, sure, but on the other hand that is a good problem to have.

Anyway, I understand this is pretty subjective and depends on how you think about your projects and your requirements, so I do understand there will be people both in agreement and disagreement.

I completely agree it's overkill to run your own cluster. That'd be good for a learning experience, but way too complex to use/maintain otherwise.

However I read somewhere you had experience with Kubernetes, right? That means there is no extra work to learn the technology.

Now let's take your blog as an example. I'm gonna guess there's an official Docker image for whatever software you use and you could create an image, deployment + service + ingress in less than an hour for it (pretty much every example out there is about how to set up a blog, heh).

If you have to do all that manually through SSH, I'd argue it takes pretty much the same time and the complexity is the same. You will simply change the tools/concepts but won't be caught by manual gotchas.
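
For a sense of scale, here is a rough sketch of the deployment + service part for a blog. This is not anyone's actual setup: the image name `example/blog:1.0`, the `blog` labels, and the ports are all hypothetical placeholders.

```yaml
# Illustrative sketch; image, names, and ports are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blog
  template:
    metadata:
      labels:
        app: blog
    spec:
      containers:
      - name: blog
        image: example/blog:1.0    # hypothetical image for the blog software
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: blog
spec:
  selector:
    app: blog                      # routes to pods created by the Deployment above
  ports:
  - port: 80                       # port exposed inside the cluster
    targetPort: 8080               # port the container listens on
```

Both documents can live in one file and be applied together with kubectl apply -f; an Ingress pointing at the `blog` service would be the third piece.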

> do I think my software will succeed?

With hobby projects for the vast majority of people, the answer is somewhere in the territory of "you can worry about that after it succeeds."

Let's talk about complexity for a moment, for one aspect of a simple service: ingress.

Kubernetes:

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: test-ingress
    spec:
      rules:
      - http:
          paths:
          - path: /testpath
            backend:
              serviceName: test
              servicePort: 80

Nginx:

    location /testpath {
        proxy_pass http://127.0.0.1:8080;
    }

Which is less complex? Which is beta, and thus could be changed over time (it happens a lot)? Which one requires major (and breaking) infrastructure updates every 3 months?

> I'd argue you have less YAMLs with k8s than with Ansible for a small project.

Since you typically need one yaml document per K8s resource, and can describe multiple independent and idempotent actions in one Ansible document, I think this is easily demonstrable as false for small projects (and likely big projects as well).

The first one is a complete configuration you could kubectl apply into a cluster that sends traffic to a backend service that may be running across multiple instances on multiple machines.

The second is a configuration fragment that is useless by itself that would send all traffic to a single instance running on localhost.

You can just drop it in a conf.d folder created by the package install and it would work just fine.

Also we're talking simple projects, and simple projects are typically not distributed; typically don't need to be distributed.

That said, supporting multiple backends with different load balancing algorithms is pretty simple to write as well.

Ok, great... I have a bunch of simple projects like that, one web instance running on a single host.

How do I safely upgrade it without downtime? Ensuring that the new version starts up properly and can receive traffic before sending it requests and stopping the old one?

With k8s: kubectl set image deployment/my-deployment mycontainer=myimage:1.9.1 (or just use kubectl apply -f)

With your nginx config: ????
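
For the k8s half of that comparison, the one-line kubectl set image command only gives a safe, zero-downtime swap if the Deployment is set up for it. A hedged sketch of what that might look like follows; the deployment and container names come from the command above, while the `my-app` label, the probe path and port, and the starting tag `1.9.0` are assumptions.

```yaml
# Illustrative sketch; label, probe endpoint, port, and starting tag are assumed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # start one new pod alongside the old one
      maxUnavailable: 0    # never take the old pod away before the new one is ready
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: mycontainer
        image: myimage:1.9.0
        ports:
        - containerPort: 8080
        readinessProbe:    # traffic is only sent once this check passes
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
```

With this in place, changing the image tag starts a new pod, waits for its readiness probe to pass, and only then stops the old one.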

That moves the goalposts a bit. We've gone from a simple service to a fleet of highly available services on multiple hosts with zero downtime requirements.

At which point, sure, use Kubernetes.

I think this is proving my point. They're basically the same, but one is completely dynamic and the other will have to be changed as soon as anything changes.

We've been using Kubernetes in production for almost two years now and I have yet to face a big API change that breaks everything. The core APIs are stable. There's a lot of new stuff added, but nothing that breaks backwards compatibility.

As you just said, if it's a simple project, you can upgrade the infrastructure by clicking "upgrade" on GKE. We've only ever hit problems when upgrading while using bleeding-edge stuff, plus the occasional bug (once between Kubernetes 1.1 and 1.10 for a large Rails app).

Regarding the yaml document per resource, I mean... that's spaghetti Ansible. If you want to have a proper Ansible setup you will have separate tasks and roles. If we're going down the route of just have "less files" you can have all the Kubernetes resources in a single YAML file. I would definitely not recommend that tho.

While Kubernetes is a lot more verbose, it is light years better than Ansible's weird Jinja2 syntax. Even someone who has never heard of Kubernetes can read that Ingress resource and guess what it does. If we're being really picky, actually, the proxy_pass should be pointing to some "test" thing that would have to be an upstream, which would already make the NGINX config more obscure.

I feel like people just hate Kubernetes to avoid change. These arguments are never about Kubernetes' strengths or faults but about why "I think my way is better". You can always do things in millions of ways and they all have tradeoffs. The fact is: ROI on Kubernetes is a lot bigger than any other kind of manual setup.

I'll repeat what I said in another comment: the only reason to do anything by hand today is if it's a play thing.

Are you using GKE for your production environment? Kubernetes has lots of great features, and definitely imposes some good operational patterns, but it's no panacea, and for those of us who can't put their applications in the cloud (or don't want to) Kubernetes can be a complex beast to integrate.

Are you running databases in Kubernetes? Do you have any existing on-prem infrastructure that you want to utilize (or are required to, because sunk costs) to integrate with Kubernetes?

Is your company a startup with no existing legacy application that you need to figure out how to containerize before you can even think about getting it into Kubernetes? We've seen benefits from running it, for sure, but I'm honestly not sure if the amount of work it took (and still takes) to make it work for us was worth it.

Sometimes I feel like Kubernetes is a play by Google to get everyone using their methodology, but only they can provide the seamless experience of GKE.

I think the key element here is that you are using GKE. Managed cloud Kubernetes, and self-hosted on-prem Kubernetes are two different beasts. Yes, it's easy when you don't actually have to run the cluster yourself.

Ansible has its warts, but it is great for managing and configuring individual servers.

Yes I agree. But in the context of this article, we're talking about small projects. So I wouldn't expect the need for an on-prem setup.

In fact I've used Kubespray[1] to set up a cluster with Ansible before with mild success. Nothing production ready, but it's actually a good tool for that job. At the end of the day you can't run Kubernetes on Kubernetes :D

[1]: https://github.com/kubernetes-incubator/kubespray

I really think that all Kubernetes related posts and articles need a footnote -

"Kubernetes allows you to scale simply!"*

"Kubernetes takes minutes to set up!"*

*When using GKE or another managed service.

I guess it depends on your definition of "small projects". I agree with the article that if you are interested in getting something out there for people to use and see what kind of interest you get, then adding Kubernetes to the mix doesn't really get you there faster. If anything, I think it would slow you down, unless we are talking about a very trivial app.

I was responding more to the comments I had been reading, not the premise of the article.

How do you configure it with a master Postgres DB that is persistent over reboots, hooked up to two hot slaves? With the ability to do off-site backups?

All deployments I have seen so far have been immutable, infinitely scalable web apps. That is easy.

That's true. Stateless apps are a lot easier. I have myself had to put a stateful service in K8s even before they had StatefulSets and it wasn't a walk in the park.

However doing what you described is hard... anywhere.

Even if you do it all by hand it's going to be hard and most likely brittle. It might take a bit more time/effort to do that on Kubernetes but I would say you would end up with a better solution that can actually sustain change.

As stated before by many, Kubernetes is not a magic wand. But it does force you to build things in a way that can sustain change without a big overhead.

We all know that different tools have different purposes and I'm not advocating it's perfect by any means. All I'm trying to say is that this idea - that a lot of people have - that Kubernetes is 10x more complex than doing X by hand is an illusion.

I think the number of containers rather than the number of machines should be the leading factor.

IMHO it makes sense for most setups that have multiple micro-services that need to interact with each other. A single node cluster running a single container is kind of pointless; I agree. And you are not going to run much more than that on a micro instance. So, I agree with the main point of the article that this probably is not an appropriate setup for any kind of home setup unless of course you really want to have kubernetes (which would be a valid reason for attempting this).

If you run multiple microservices you have most of the problems that Kubernetes solves out of the box, and the cost of attempting to solve those by manually cobbling together bits of infrastructure outweighs the financial overhead of running Kubernetes. So for any moderately small setup where you are in any case going to have 2 or 3 machines running multiple containers, you probably should be looking at Kubernetes.

So, if you are on Google or Amazon, hosted Kubernetes is definitely worth considering. You probably want a load balancer as well. So, at that point you are looking at ~$50-100+ per month anyway for a couple of instances, an LB and whatever else you need (e.g. RDS, S3, etc).

For anything running commercially, that's entirely defensible. Yes, you can run cheaper on bare metal, but people tend to forget that all the hours spent doing devops stuff also cost you. The cost of one day of a competent dev's time will easily pay for running Kubernetes for quite some time. Unless your devs are super bored, make them spend their hours on more valuable stuff than reinventing wheels.