Hacker News new | ask | show | jobs
by Wandfarbe 2143 days ago
I'm not sure if you have seen a proper build and maintained automated e2e lifecycle.

You write code, you push it, 5 Minutes later it is rolling out, tested, with health checks and health metrics.

Your infrastructure itself is keeping itself up to date (nightly image builds, e2e tests etc.)

It just works and runs. It doesn't make the same mistake twice, it doesn't need an expert to be used.

I'm not saying its for everyone! Put three/four VMs on AWS, add a managed database and you are good to go with your ansible. Use a Jira plugin to create reaccuring Tickets for doing your maintenance and that should work fine.

Nonethless, based on your 'random list of things' it does sound like you are not doing it right.

There is something very wrong if you really think its critical for you to be able to 'hot fix' aka playing hero by jumping on your vms and hacking around. IF you only one single server for your production env. there is no risk of forgetting a server to patch but there is still the issue of forgetting to backport it (which is probably the wrong term if you don't hotfix your release branch)

Most mistakes i do, are mistakes i do because i was able to do them.

And that might sound unfortunate but there is a feeling of trust for your setup. At least i get that feeling and i get that through automatisation. Knowing and seeing the deployprocess just working day in day out. Knowing that my monitoring and alerting is setup properly, knowing that the systems keep themselfs up to date, knowing there are proper tests in place.

2 comments

> You write code, you push it, 5 Minutes later it is rolling out, tested, with health checks and health metrics.

Yep, I have that. It's more like 15 minutes than 5 for me, but this process has nothing to do with containers - it can be done in the same way without containers.

> It just works and runs. It doesn't make the same mistake twice, it doesn't need an expert to be used.

Except when you misconfigure something on friday night and it does the same mistake 100 times per hour until someone notices it.

This will happen once and then there will be a test for it.

My automated system will only get more resiliant over time. This is a benefit for the system itself.

Of course when you do it manually, you will learn and gain experience but thats only for YOU. It does not just get transfered to your colleagues and when you are on holiday and shit hits the fan, it will not help.

My biggest reason why i like automatisation so much is: the company becomes less reliant on me.

It is the same mechanism why the industry is getting more computer logic: Machines are complicated and you need to train people. Make it easier for people to 'just use' and you have more people available which you need to train less.

That's why there is a don't deploy on Friday rule.
Yeah, I don't think that's a new problem with containers.