by closeparen 1423 days ago
Running a microservices architecture on a shared cluster is complicated.

* Because you are running multiple workloads on the same kernel, you need to protect them from each other, from both correctness and performance perspectives.

* Because the workloads have different scaling characteristics, you need to solve a bin-packing problem to make efficient use of the resources.

* Because of said bin-packing, workloads move around a lot, so you invite a service discovery problem much more intense than classic DNS is meant to solve.

* Because normal software only knows how to use DNS, you invite the need for sidecars and virtual network overlays.
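To make the bin-packing point concrete, here is a minimal sketch of the first-fit-decreasing heuristic, which is one common approach and not necessarily what any particular scheduler does. The workload names, the single CPU dimension (in millicores), and the uniform node size are all assumptions for illustration; real schedulers juggle several resources and constraints at once.

```python
from typing import Dict, List


def first_fit_decreasing(demands: Dict[str, int], node_capacity: int) -> List[Dict[str, int]]:
    """Assign workloads to nodes using the first-fit-decreasing heuristic.

    demands maps a (hypothetical) workload name to its CPU request in millicores.
    Returns one dict per node, mapping workload name -> CPU request.
    """
    nodes: List[Dict[str, int]] = []  # workloads placed on each node
    free: List[int] = []              # remaining capacity of each node

    # Place the largest workloads first; small ones then fill the gaps.
    for name, demand in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if demand > node_capacity:
            raise ValueError(f"{name} needs {demand}, more than one node offers")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                # First node with enough room wins.
                nodes[i][name] = demand
                free[i] -= demand
                break
        else:
            # No existing node fits, so open a new one.
            nodes.append({name: demand})
            free.append(node_capacity - demand)
    return nodes


# Hypothetical workloads and node size, purely for illustration.
workloads = {"api": 2000, "worker": 3000, "cache": 1000, "batch": 2000}
placement = first_fit_decreasing(workloads, node_capacity=4000)
```

With these made-up numbers the four workloads fit on two nodes. Every time a demand changes, the placement can change too, which is exactly why workloads end up moving around and service discovery gets harder.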

When you have this set of problems, k8s seems appropriate. I work at a company with this set of problems but no Kubernetes, and boy is it a production. But my takeaway from that is less "use Kubernetes" and more "try not to have those problems." Use relatively monolithic architectures: one or a few services on their own stable pools of nodes, with boring old reverse-proxy load balancers and DNS, for as long as you possibly can.

1 comment

I have come at this problem from a bit of a different angle, by asking: how close can I get to the hypothetical dream state where everything is automated, autoscaling, blah blah blah, on the absolute smallest budget, counting not only actual costs but my time as well?

I only know the GCP ecosystem reasonably well, so I don't fully know to what extent these things exist in AWS and Azure. But on GCP I think there is a really nice path via the serverless route that skips K8s entirely, yet keeps you very well aligned in case you ever need to "upgrade" or get out of the GCP ecosystem.

I write very standard gRPC services, put them onto Cloud Run (which has a very Heroku-like workflow), and stick https://cloud.google.com/api-gateway in front of things. Now my API is running on the exact same setup as any other service Google is running in production. Huge amounts of logic get moved out of my code base as a result.

If you are also willing to write your APIs in a fairly particular way (https://google.aip.dev/), it becomes trivial to integrate other things like https://cloud.google.com/workflows, https://cloud.google.com/pubsub and https://cloud.google.com/tasks. Those cover what was traditionally a lot of the "state" and weirdly complicated logic that lived in my code. I'm now not really writing any of that.

Now it's all declarative: I just say what I want to happen, and I don't have to think about much beyond that, because it too uses that same internal GCP infrastructure to handle all the complicated parts around what to do when things go wrong.

To me these services are all very heavily aligned with the K8s path, so the lock-in certainly doesn't feel as scary.

You still have to ask the question: why not just deploy the monolith to a VM and move on until you need to think about anything else?

For me it’s not JUST an investment in the future. The very real immediate benefits are the huge amount of code I don’t have to write and all the insanely advanced stuff I get for free.

To give a quick example of each: on the code side, I don’t think about things like authN, authZ, retry logic, health checking, logging, tracing, or most security concerns outside of things like input validation. All of that is now just a configuration setting for me.
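For a sense of what "retry logic" means when you do have to write it yourself, here is a rough sketch of retrying with exponential backoff and jitter. This is not how any GCP product implements it; the function name, the defaults, and the choice of `ConnectionError` as the retryable error are assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on ConnectionError with exponential backoff.

    The wait before retry k (counting from 0) is a random value in
    [0, base_delay * 2**k], so many clients do not retry in lockstep.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    failures = 0
    while True:
        try:
            return call()
        except ConnectionError:
            failures += 1
            if failures >= max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            sleep(random.uniform(0, base_delay * 2 ** (failures - 1)))
```

Even this small version has decisions baked in (which errors are retryable, how long to wait, when to give up), and every service needs its own copy unless the platform takes it over.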

Then on the advanced features side, to give just one example: if I set a security policy saying this service can perform certain actions on another service, and I don’t end up using most of those permissions, it will automatically notify me later and help me rewrite the security policy to cover only the things I need in practice. It can even help me test the new policy to ensure things don’t break. Finding and fixing those kinds of things was previously basically an entire person’s job, usually done manually, because the amount of code needed to automate it is painful to think about. Now it’s just another feature I pick up for free.
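The core idea behind that kind of recommendation can be sketched in a few lines: compare what a policy grants with what was actually used. This is only a toy model under my own assumptions, not GCP's implementation; the permission names and function names below are hypothetical, and the hard part in practice is collecting reliable usage data over a long enough window.

```python
from typing import Iterable, Set


def unused_permissions(granted: Iterable[str], observed_calls: Iterable[str]) -> Set[str]:
    """Permissions the policy grants that never showed up in observed usage."""
    return set(granted) - set(observed_calls)


def tightened_policy(granted: Iterable[str], observed_calls: Iterable[str]) -> Set[str]:
    """A proposed policy that keeps only the permissions actually exercised."""
    return set(granted) & set(observed_calls)


def calls_denied_by(policy: Iterable[str], observed_calls: Iterable[str]) -> Set[str]:
    """Replay observed usage against a policy and report what would break."""
    return set(observed_calls) - set(policy)


# Hypothetical permission names, purely for illustration.
granted = {"orders.read", "orders.write", "orders.delete"}
observed = ["orders.read", "orders.read"]

proposal = tightened_policy(granted, observed)
```

Replaying past calls against the proposal with `calls_denied_by` is the toy equivalent of testing the new policy before rolling it out.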