Not everyone is into gaming. I'd rather code on my side projects than use my console. Other people tweak and customize their Linux installation instead of doing work on it. Some people like to work on their cars; driving is only a small part of it.
I agree, and I am just as guilty of procrastination. However, the author is not really procrastinating: he gets paid for this. Me, I do in fact procrastinate on setting up a Minecraft server infra in the cloud. Maybe that’s precisely why the solution to this problem strikes me as inadequate:
> So, the Minecraft server should work reliably and, if it goes down, I should know well before they do
How are metrics helpful? There is so much fun that could be had in setting up an actually resilient system instead.
Why worry over metrics and alerts when you could orchestrate an infrastructure that grants you the superpower of being able to spin up a server with a copy of the world on a whim instead (or even a system that auto-starts one whenever there is demand)?
You are oddly negative about this piece and don't seem to understand that your definition of fun is not universal.
You said, "There is so much fun that could be had in setting up an actually resilient system instead." Maybe the author has more fun setting up alerts and metrics than building a resilient system, the way you would?
The truth is that in most real-world scenarios, having alerts and metrics is much more important than building a fully resilient system, which is expensive and may be overengineering at an early stage.
> However, the author is not really procrastinating: he gets paid for this.
The first sentence of the blog post says, "One of the secret pleasures of life is to be paid for things you would do for free." I can very much understand that, as I often spend my free time working or playing with things I could use at work.
> How are metrics helpful? There is so much fun that could be had in setting up an actually resilient system instead.
Metrics are a means to an end: alerting. And by alerting, I mean getting pinged on my phone when something important breaks. Like, you know, the server going down.
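To make "getting pinged when the server goes down" concrete, here is a minimal sketch of a reachability check. It is not the setup from the blog post. It assumes the server listens on Minecraft's default TCP port 25565, and the names `is_port_open`, `check_and_alert`, and the `notify` callback are hypothetical placeholders for whatever actually sends the message to your phone.

```python
import socket


def is_port_open(host: str, port: int = 25565, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_and_alert(host, port, notify, probe=is_port_open):
    """Probe the server once; call notify(message) if it is unreachable.

    `notify` is a hypothetical hook (push service, email, chat webhook).
    `probe` is injectable so the logic can be tested without a network.
    """
    if probe(host, port):
        return True
    notify(f"Minecraft server {host}:{port} is unreachable")
    return False
```

Run from cron or a timer with something like `check_and_alert("mc.example.org", 25565, print)`. A bare TCP check only shows that the port accepts connections, not that the game is healthy, which is where real metrics come in.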
> Why worry over metrics and alerts when you could orchestrate an infrastructure that grants you the superpower of being able to spin up a server with a copy of the world on a whim instead (or even a system that auto-starts one whenever there is demand)?
As somebody who has run cloud and enterprise software for almost two decades now, I can tell you that needs monitoring too. The more moving parts there are, the more things go wrong. The more things go wrong, and the more you care that they get fixed, the more monitoring you need :-)
I believe you! Given your affiliation, I just wanted to point out to any newbie SREs in the audience that perhaps there is a better way. I still think my approach is better, but we can do things differently.
Indeed, if there were “official” container images out there, I might instead have run the server on Google Cloud Run or AWS App Runner, without having to take care of the Linux underneath. Or as an Amazon ECS task. I don’t have a Kubernetes cluster, but at some point I will write a version of this blog post that runs it on K8s :-)