by olalonde 3596 days ago
I just used `./cluster/kube-up.sh` to set up my cluster on AWS. I am now wondering what's missing for a production setup. It seems to be working OK so far (though I just have 3 minions and a few pods). One thing I wish I knew is how to safely upgrade the cluster without re-creating it from scratch. Care to elaborate a bit?
5 comments

The problem is that it's not declarative. You can't tweak the config and run it again to converge.

Kubeup is designed to run once, unlike systems such as Puppet and Terraform that declaratively set up the world to fit your specification.
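To make the run-once vs. converge distinction concrete, here is a minimal sketch of what a declarative tool does on every run: compare a desired spec against the actual state and apply only the difference. The resource names and the dict-based "state" are purely illustrative assumptions, not how Puppet, Terraform, or kube-up are actually implemented.

```python
import copy


def converge(desired: dict, actual: dict) -> list:
    """Mutate `actual` until it matches `desired`; return the actions taken.

    Both arguments map a (hypothetical) resource name to its spec.
    Running this twice in a row with the same `desired` yields no actions
    the second time, which is the property a run-once script lacks.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
            actual[name] = copy.deepcopy(spec)
        elif actual[name] != spec:
            actions.append(("update", name))
            actual[name] = copy.deepcopy(spec)
    # Anything present but no longer declared gets removed.
    for name in list(actual):
        if name not in desired:
            actions.append(("delete", name))
            del actual[name]
    return actions


# Illustrative cluster description (names and sizes are made up).
desired = {"master": {"size": "m3.medium"}, "minion-1": {"size": "t2.small"}}
actual = {}
print(converge(desired, actual))   # first run creates everything
print(converge(desired, actual))   # second run has nothing to do
desired["minion-1"]["size"] = "t2.medium"
print(converge(desired, actual))   # a tweak results in a targeted update
```

A run-once script corresponds to only ever executing the first call: there is no step that reads the current state, so tweaking the config and re-running cannot produce a targeted update.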

Kubeup also does a lot of mysterious stuff. By using it, you don't have a clear idea of which pieces have been set up and how they slot into each other. It is, in short, opaque and magical.

For comparison, I set up Kubernetes with Salt on AWS. It was, by all means, "the hard way", and took me a few days to get running and a couple of weeks to run completely correctly (a lot of stuff, like kubeconfig and TLS behaviour, is still undocumented), but as a byproduct I now have the entire setup in a reproducible, self-documenting, version-controlled config.

Have you by chance open sourced your setup? I started going down this route with terraform, but ended up stopping and just using the kube-up script due to time constraints.

However, now that I have a cluster up and running, I can take the time to build a parallel cluster with more understanding, and migrate the services to it.

I found a terraform example, but it declares itself out of date, and looked more complicated than I thought it should be... that was just a gut feel though.

I have not used Salt, but I always like to learn new things, especially if they make my life easier.

I'd be very interested in checking out your setup, and any lessons learned you have to share.

Thanks in advance

This project looks very interesting, and once I'm more comfortable at a lower level of abstraction, I may use this.
I haven't, but I would be happy to.

I just need to generalize it a little bit. Email me and I will send you a link once it's done, sometime next week (I'm on vacation).

Hi, I would just like to second the request for an open-source, production-ready implementation. They seem to be rarely shared, which is a big shame.

I've been playing with https://github.com/kz8s/tack lately, but your implementation sounds even more comprehensive.

I used this as reference - https://github.com/samsung-cnct/kraken

Terraform + Ansible for K8S on CoreOS

Thanks for the tip. I will take a closer look at this.
Oh yeah I agree, that's annoying. I considered using one of the few projects that attempt to solve this problem but decided to stick with `kube-up.sh` because, as a beginner, I'd have a hard time telling which pieces belong to Kubernetes vs the third party tool. I also don't have time to become a Kubernetes expert because I'm crazy busy developing. Hopefully, Kubernetes will eventually obsolete `kube-up.sh` for something better based on Salt/CloudFormation/Terraform/etc.
Watch this tool called kops - https://github.com/kubernetes/kops

Official upcoming replacement for kube-up

Kube-up actually uses Salt for some of the setup. But it's a big mess, since it needs to support a lot of platforms, Linuxes and cloud providers.

I don't know what the future of Kubernetes setup is, exactly, but right now it's quite safe to settle on Salt, Puppet, Ansible or Terraform. I haven't used Terraform, so I don't know how suitable it is to OS-level setup (things that the aforementioned tools are good at), as opposed to orchestration.

When you click the GKE (hosted Kubernetes) button on Google Cloud Platform, it's those very Salt configs that set up your nodes (and once upon a time they set up your control plane too).

Basically what's arguably the best publicly available Kubernetes setup in the world is hiding in that Salt codebase, and EVERY would-be Kubernetes admin should look at it before venturing on their own.

I used it as inspiration for my setup. But you also need a bunch of other stuff, like the CA setup and Kubelet cert generation, which are buried in the whole kube-up structure.
Yeah, agreed. CoreOS + cloud-init could remove the need for OS-level setup.
There's always something to do at the OS-level, which is why cloud-init configurations tend to spiral out of control with in-line scripts, configs, and binary downloads.

There's no getting away from configuration management and software installation at SOME level of your stack, and setting up a substrate for Kubernetes is no exception.

Nah, who needs all these version-controlled configs?

Just look how google runs stuff! https://cloud.google.com/compute/docs/tutorials/setup-joomla

If you're looking for something that is a little more flexible for deploying Kubernetes, I recommend either KOPS[1] or kube-aws[2]. kube-aws is tethered to AWS but is much more flexible than the standard kube-up.sh script. KOPS is the heaviest-lifting tool I've found for deploying Kubernetes. It's short for Kubernetes Ops and (I believe) it can even generate Terraform configs so you can get the upgrades without re-creating everything.

[1] https://github.com/kubernetes/kops
[2] https://github.com/coreos/coreos-kubernetes/tree/master/mult...

KOPS is pretty awesome in that:

* Actually pretty much works for what's in scope.

* It's got some nice configuration options that are discoverable and not hidden away in envars...

* Some good prelim docs explaining how kubernetes is bootstrapped

* Cluster management seems to function properly

* Updating/upgrading

What's missing IMHO (from an AWS user's standpoint, including kops and k8s):

* SUPER unapproachable codebase ATM for KOPS and friends

* More flexible cluster DNS naming so we can leverage real wildcard certs across dev environments

* Running kubernetes in private networks

* Passing in existing networks created through other tools (Terraform, CloudFormation, custom, etc.)

* Responsibility for stuff seems spread out across projects, and it is unclear which lies where (which also makes contributing less approachable)

* AWS controllers that don't seem to fully leverage the AWS APIs (traffic is balanced to all nodes and then proxied via kube-proxy; no autoscaling lifecycle event hooks)

* Unclear situation on the status of ingress controllers; are they even in use now or is it all the old way?!

* No audit trails

* IAM roles for pods

* Stuff I'm probably missing

It's very frustrating TBH. On one hand AWS ECS now has IAM roles for containers, support for the new Application Load Balancer, and private subnet support. On the other hand they DON'T have pet sets, automatic EBS volume mounting (WTF), a secrets store, configuration API, etc. Also frustrating is that I feel the barrier to contribute is a bit too high ATM even though I have the necessary skills.

It's SO close though. If I can get private, existing subnet support I can probably start running auto provisioned clusters that are of use for some of our ancillary services in production. From there I might be able to help contribute to KOPS and AWS controllers. Right now it looks like there is just this one guy doing most of the work on AWS and KOPS; probably quite overloaded.

IAM roles for pods: https://github.com/jtblin/kube2iam

Running kubernetes in private networks: You could probably get private subnet support by

* Deploying manually or deploying with a script, then changing things in AWS (route tables, public IP, etc) to be private, manually afterwards (both cumbersome but possible)

* Using NodePort instead of LoadBalancer on any services
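As a rough illustration of the NodePort suggestion, the sketch below builds a Service manifest with `type: NodePort` and prints it as JSON, which kubectl can read as well as YAML. The service name, selector labels, and port numbers are hypothetical placeholders, not values from anyone's cluster in this thread.

```python
import json


def nodeport_service(name, selector, port, target_port, node_port=None):
    """Build a Kubernetes Service manifest that exposes pods on a port of every node.

    If `node_port` is omitted, Kubernetes picks one from its configured
    node port range (30000-32767 by default).
    """
    port_spec = {"protocol": "TCP", "port": port, "targetPort": target_port}
    if node_port is not None:
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",      # instead of "LoadBalancer"
            "selector": selector,    # which pods receive the traffic
            "ports": [port_spec],
        },
    }


# Hypothetical example: pods labelled app=web listening on 8080.
manifest = nodeport_service("web", {"app": "web"}, port=80, target_port=8080)
print(json.dumps(manifest, indent=2))
```

The printed manifest can be saved to a file and passed to kubectl with `-f`. No cloud load balancer is created for it, so reaching the node port from outside the private network is left to whatever you put in front of the nodes.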

(Currently work in progress, but working. Some of these statements are forward-looking.)

Lol

That is perfectly fine if it suits your use case! I have to deal with industry certifications, and unfortunately using an ad-hoc certificate authority or running on insecure ports is not an option.

Also I was setting it up on CoreOS and bare-metal servers. It should be possible to run pods in Google Container Engine or similar very easily, but would there be any fun in that?

The CoreOS team has worked a ton to improve the baremetal installation experience for Kubernetes. You can read more about it here:

https://coreos.com/kubernetes/docs/latest/kubernetes-on-bare...

And the installation flow that builds on top of that for Tectonic:

https://tectonic.com/blog/tectonic-1-3-release.html

Right, no industry certification to follow here and pretty loose availability requirements. You just had me worried for a minute that everything would suddenly grind to a halt or that there were glaring security holes! But my "production" requirements are definitely not as strong as yours.
If you move etcd to a separate t2.nano then upgrading is easy. The only stateful part is etcd.
Have you tried shooting a node in the head and seeing what happens? Always a good exercise to run. Run a few disaster recovery exercises and see if you can get it back. I recommend doing that on non-production of course!
Thanks for the tip. I did yesterday actually by manually shutting down the node from SSH (sudo shutdown). It seemed to "just work" without having to do anything else. There might have been a tiny period of unavailability to one of my services but not enough for me to notice. Luckily, I don't have crazy high availability requirements yet.
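For anyone repeating the node-shutdown exercise described above, a small helper like the following can list which nodes the API server no longer reports as Ready. It is only a sketch: it assumes `kubectl` is on the PATH and already pointed at the (non-production) cluster, and the function names are made up for illustration.

```python
import json
import subprocess


def not_ready_nodes(node_list: dict) -> list:
    """Return names of nodes whose "Ready" condition is not "True".

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    """
    bad = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            bad.append(node["metadata"]["name"])
    return bad


def fetch_nodes() -> dict:
    """Ask kubectl for the current node list (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `not_ready_nodes(fetch_nodes())` before and after shutting a node down shows whether the cluster noticed the failure; it may take a little while for the node's status to change.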