
by lmilcin 1744 days ago

> It can be complicated, but usually it's not.

The usual story. Everything works, until it doesn't.

If you are a huge corp with good engineering you can have people dedicated to understanding k8s and then it kinda makes sense. They can spend time to learn it really well so that they have the necessary chops to deal with problems when they happen.

On the other hand, if you are a smaller company, you are more likely to be embracing this new idea of developers running everything, including k8s, and you are in for trouble.

They will know how to make it work but that's about it.

Because if you need to learn everything you actually learn nothing very well. And there certainly isn't enough time in the world to learn everything in development.

My philosophy is that applications must be built for when they break, and that it is unacceptable to run an application with a team that will not be able to fix it when it breaks.

**

A couple of years ago I joined a small group of developer teams (about 40 devs in total) who together maintained a collection of 140 services, all with the same or a very similar stack (Java, Spring, REST, RabbitMQ).

They had trouble delivering anything to prod because of complex dependencies, complex requirements, complex networking, and a complex process for finding out where stuff broke across the 7 layers of services between the original user call and the actual execution of their action.

I rolled up my sleeves and put everything in a single repo, with a single build, single set of dependencies, single version, single deployment process, and single layer of static app servers.

I left after the team was reduced from 40 to 5. There was no problem delivering anything (we had 60 successful deployments in a row when I left) and the guys who were left admitted they were bored and underutilized because of how easy it was to make changes to the system.

These were still the same guys that were advocating for microservices. From what I heard they are not advocating for microservices anymore.

Can microservices be done well? Sure they can. But it takes additional effort and experience to do it well. Why make your life difficult when it is not needed?


cameronh90 1744 days ago

"The usual story. Everything works, until it doesn't."

But the same is true of all the other orchestration tools isn't it?

I've had problems just as complicated with Terraform, Ansible, Chef, Puppet and just plain Linux as I have had with Kubernetes. Meanwhile, K8s saves a lot of time when things do work properly, which is nearly always.

A while ago, we had an issue with dotnet where the JIT was sometimes emitting bad instructions and crashing the process. That was an absolute bloody nightmare to debug and reproduce, and it took weeks, but nobody would say running a high-level language is bad because the compiler might have a bug, right?

We are a small company (under 20 developers), we have one dedicated ops person and one devops person, and we have never had any issues with k8s that couldn't be resolved by one of them within a few hours. We run a monorepo with 6 app services as part of our core product, 10 beta/research services, then a handful more infrastructure services (Redis etc.), and honestly it's been the smoothest environment I've ever worked with. All the infrastructure is defined in code, can be released onto a blank AWS account (or k3s instance) in minutes, and all scales up and down dynamically with load. Most of the time when something goes wrong, it's a bug in our code.

Maybe the problem with your system was more about the excessive use of microservices and general system architecture rather than Kubernetes itself?

lmilcin 1744 days ago

> But the same is true of all the other orchestration tools isn't it?

Of course. The difference being how complicated it is to deal with problems.

For example I find it is way easier to deal with problems with Ansible compared to Chef.

So, assuming that both get me what I need, I prefer Ansible because it is less of a drag when I have the least time available to babysit it (which usually happens at the least opportune moment).

What I am trying to say is that just because it works for you now doesn't mean it will not end in disaster at some point in the future. It is not my place to tell you whether the risk is acceptable for you. But I personally try to avoid situations that I cannot easily back out of.

If I have a script that starts an application on a regular VM, I KNOW I can fix it whatever may happen to it. Not that I advocate running your services with a script; I just mean there is a spectrum of possible solutions with tradeoffs, and it is good to understand those tradeoffs.

Some of those tradeoffs are not easily visible because they may only show themselves in special situations or, conversely, be so spread out over time and across your domain that you just don't perceive the little drag you get on everything you do.

I find that if there is any overarching principle for building better solutions, it is simplicity.

Presented with two solutions to the problem, the simpler solution is almost always better (the issue being the exact definition of what it means to be simpler).

For example, I have joined many teams in the past that had huge problems with their applications. I met teams that were very advanced (they liked overcomplicating their code) and I met teams that could barely develop (they had trouble even writing code, did not know or use more advanced patterns).

I found that it is easier to help teams that had huge problems but wrote stupid code, because it is easier to refactor the stupid simple code that beginner developers write than it is to try to unpack the extremely convoluted structures that "advanced" developers make.

I think something similar applies to infrastructure.

For example, when faced with an outage I would usually prefer simpler infrastructure in which I know I understand every element.