| HN Mirror


by trumpeta 1946 days ago

Maybe I can offer an answer to your question. I have worked at a couple of companies where we ran "small" scale k8s clusters (1-100 nodes, as you say).

We chose k8s, and I would again, because it's nice to use. It's not necessarily easier; as you point out, the complexity of managing the cluster is considerable. But if you use a managed cluster like EKS or DO's k8s offering, you don't have to worry too much about the nodes. The unit of worry is the k8s config, and for deployment you can use Docker.

I like Docker because it's nice to have the same setup locally as you have remotely.

In my experience the tooling around k8s is nice to manage declaratively. I never liked working with machines directly, because even tools like Chef or Ansible feel very flimsy.

The other thing you can do is run on ECS or similar, but there the flexibility is a lot lower. So k8s for me offers the sweet spot of being able to do a lot quickly with a nice declarative interface.

I'd be interested to hear your take on how to best run a small cluster though.


grey-area 1946 days ago

Thanks, that's really interesting. Everyone has different challenges and requirements, and of course different experiences.

For smaller setups (say 1-10 services) I'm quite happy with cloud config and one VM per process behind one load balancer per service. It's simple to set up, scale and reproduce. This setup doesn't autoscale, but I've never really felt the need. We use Go and deploy one static binary per service at work with minimal dependencies so docker has never been very interesting. We could redeploy almost all the services we run within minutes if required with no data loss, so that bit feels similar to K8s I imagine.

For even smaller companies (many services at many companies) a single reliable server per service is often fine - it depends of course on things like uptime requirements for that service but not everything is of critical importance and sometimes uptime can be higher with a single untouched service.

I think what I'd worry about with a k8s config which affects live deployments is that I could make a tweak which seemed reasonable in isolation but broke things in inscrutable ways - many outages at big companies seem to be related to config changes nowadays.

With a simpler setup there is less chance of bringing everything down with a config change, because things are relatively static after deploy.


je42 1946 days ago

>We use Go and deploy one static binary per service at work with minimal dependencies so docker has never been very interesting.

how do you deploy your static binary to the server? (without much downtime ?)


grey-area 1946 days ago

Sorry, that should have said one binary per node really, not per service (though it is one binary per service, just on a few nodes for redundancy and load).

Services sit behind a load balancer, so one node at a time is replaced and then restarted behind it, and/or you can do graceful restarts. There are a few ways.

They're run as systemd units and of course could restart for other reasons (OS update, crash, OOM, hardware swapped out by host). I haven't noticed any problems related to that or to deploys, and I imagine the story is the same for other methods of running services (e.g. docker). As there is a load balancer, individual nodes going down for a short time doesn't matter much.
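
A minimal sketch of the one-node-at-a-time replacement described above, written as an in-memory simulation rather than a real deploy script. Everything named here (`Node`, `rolling_deploy`, the `health_check` callback) is hypothetical; in a real setup the "restart" step would stand in for copying the new binary and restarting the systemd unit, and draining would be an API call or config change on the load balancer.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    in_rotation: bool = True  # whether the load balancer sends traffic here


def rolling_deploy(nodes, new_version, health_check=lambda node: True):
    """Upgrade one node at a time; stop at the first node that fails its check.

    Returns a list of log lines describing what was done.
    """
    log = []
    for node in nodes:
        # Never drain a node unless at least one other node is serving traffic.
        others = [n for n in nodes if n is not node and n.in_rotation]
        if not others:
            raise RuntimeError("refusing to drain the last node in rotation")

        node.in_rotation = False  # take the node out of the load balancer
        log.append(f"drain {node.name}")

        # Stand-in for: copy the new binary, restart the service.
        node.version = new_version
        log.append(f"restart {node.name} at {new_version}")

        if not health_check(node):
            # Leave the bad node drained and keep the remaining nodes untouched.
            log.append(f"abort: {node.name} unhealthy")
            return log

        node.in_rotation = True  # put the node back behind the load balancer
        log.append(f"enable {node.name}")
    return log


if __name__ == "__main__":
    cluster = [Node("a", "v1"), Node("b", "v1"), Node("c", "v1")]
    for line in rolling_deploy(cluster, "v2"):
        print(line)
```

The abort path is the design choice worth noting: a node that fails its health check stays out of rotation, and the nodes not yet touched keep serving the old version.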


notyourday 1946 days ago

> how do you deploy your static binary to the server? (without much downtime ?)

Ask yourself how you would solve this problem if you deployed by hand, then automate that.

1. Create a brain-dead registry that gets information about what runs where (service name, ip address:port number, id, git commit, service state, last healthy_at). If you want to go crazy, do it 3x.

2. Have haproxy or nginx use the registry to build a communication map between services.

You are done.

For extra credit (which is nearly cost free), with 1. you can now build a brain-dead simple control plane by sticking an interface on 1. that lets someone/something toggle services automatically. For example, if you add a percentage gauge to services, you can do hitless rolling deploys or canary deploys.
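
A rough sketch of steps 1 and 2 plus the percentage gauge, assuming an in-memory registry and HAProxy as the proxy. The class and function names (`Instance`, `Registry`, `render_haproxy_backend`) and the `percent` field are hypothetical; only the `backend` and `server ... weight N check` lines follow HAProxy's config syntax. Writing the result to the config file and reloading HAProxy are left out.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    service: str            # service name
    address: str            # "ip:port"
    id: str
    git_commit: str
    state: str              # "up" or "down"
    last_healthy_at: float  # unix timestamp of the last passing health check
    percent: int = 100      # traffic gauge, used as a relative weight


class Registry:
    """Brain-dead registry: knows what runs where and nothing else."""

    def __init__(self):
        self._instances = {}

    def register(self, inst):
        self._instances[inst.id] = inst

    def set_percent(self, instance_id, percent):
        # The control-plane toggle: change how much traffic one instance gets.
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._instances[instance_id].percent = percent

    def instances_for(self, service):
        matches = (i for i in self._instances.values() if i.service == service)
        return sorted(matches, key=lambda i: i.id)


def render_haproxy_backend(registry, service):
    """Build an HAProxy backend section from the registry contents."""
    lines = [f"backend {service}"]
    for inst in registry.instances_for(service):
        if inst.state != "up":
            continue  # instances marked down are left out of the map
        lines.append(f"    server {inst.id} {inst.address} weight {inst.percent} check")
    return "\n".join(lines)


if __name__ == "__main__":
    reg = Registry()
    reg.register(Instance("api", "10.0.0.1:8080", "api-1", "abc123", "up", 0.0))
    reg.register(Instance("api", "10.0.0.2:8080", "api-2", "def456", "up", 0.0))
    # Canary: send roughly 10% of traffic to the new commit.
    reg.set_percent("api-1", 90)
    reg.set_percent("api-2", 10)
    print(render_haproxy_backend(reg, "api"))
```

HAProxy weights are relative, so the gauge only works out to a true percentage when the values across a service's instances sum to 100, as in the 90/10 canary in the usage example.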
