by icedchai 1638 days ago
These costs are highly inflated. Not sure why you need 4 people to operate a small 6 node cluster. From my own personal experience, one guy can do that part time. Your cluster has redundancy, so most problems can wait to be dealt with during regular hours.

I would argue that 4 people is actually not enough.

BUT!

That's because I advocate for people being able to go on vacation without worrying about work, and for nobody having to be on-call 24/7/365.

That is, you don't need them full time, and you need them for every other option too, because you have to cover on-call for anything you use. Nobody at AWS or Google will so much as blink when your precious infrastructure blows up on you and you don't know what to do. You need brains to work on those things yourself. Ideally each person is on-call only once every 5 or 6 weeks.

I am not sure exactly how to calculate the percentages on those salaries, but babysitting infrastructure like that is definitely not a full-time job for 4 full-time brains doing nothing but watching your monitoring dashboards!
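
For what it's worth, here is a rough sketch of the rotation arithmetic. It assumes one-week shifts and an even round-robin, which are my assumptions and not anything the article states, and the function names are made up for illustration. It says nothing about how to split the salaries, only how often each person ends up on-call.

```python
from fractions import Fraction


def weeks_between_shifts(team_size: int, shift_weeks: int = 1) -> int:
    """Weeks from the start of one person's shift to the start of their next.

    Assumes an even round-robin where everyone takes equal-length shifts.
    """
    if team_size < 1 or shift_weeks < 1:
        raise ValueError("team_size and shift_weeks must be at least 1")
    return team_size * shift_weeks


def on_call_share(team_size: int) -> Fraction:
    """Fraction of all weeks that any one person spends on-call."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return Fraction(1, team_size)


if __name__ == "__main__":
    for size in (4, 5, 6):
        print(
            f"{size} people: on-call every {weeks_between_shifts(size)} weeks, "
            f"{on_call_share(size)} of the time"
        )
```

Under those assumptions, 4 people means each one is on-call every 4th week, so a 5 or 6 week cadence takes 5 or 6 people in the rotation, before accounting for vacations.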

As someone who operates four medium-sized (20-200 nodes) clusters as a very small part of my job on a two-person team, I agree. I spend most of my days writing new operators to automate infrastructure, not manually managing kube... Said team also owns several large Kafka brokers handling several billion messages a day, the full CI/CD pipeline, the logging/metrics/tracing stack, a sharded Postgres operator, authn/z, Terraform automation, etc., etc.

We need more people, but kube is the least of our headaches.

As a counterpoint from my own personal experience, time spent managing kubernetes is time NOT spent introducing new features.

I once had the displeasure of watching an upstart data science team at a boutique portfolio management firm break under the weight of k8s management. The team was great at critical thinking, risk modeling, and statistical analysis. They knew very little about infrastructure, such as networking fundamentals or CPU/memory management. This team went from helping our firm rapidly develop a sentiment analysis model that sifts through social media for trading signals to wrestling with kubectl all day. Productivity bombed. The team was disbanded about a year after they started using k8s.

It's a shame the author emphasized and inflated the discrete costs of operating a cluster because that emphasis and exaggeration distracts from the TRUE cost ... opportunity cost. We want our engineering teams doing what they do best, writing code and adding new features. Not bogged down in managing the plumbing/infrastructure. Happy to have that abstracted away.

Sounds like they needed an ops person, sysadmin, etc. This is nothing new: Most developers are not good at being sysadmins.
Just wait until an upstart sysops team tries to ship some trading software.