| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by arka2147483647 1542 days ago
	How would fault tolerance help with privilege escalation? Would it not just mean that you have more computers to update in your redundant/tolerant cluster?

3 comments

hnlmorg 1542 days ago

In this context it means you can take individual servers offline without taking your entire service down. So you can then update each server (even on production systems) live without requiring a maintenance window.

For bonus points you're also not babysitting manually provisioned servers but instead have your software installs automated. So any failure on a server or OS update isn't seen as a maintenance piece but rather just terminating the old server and letting your pipeline auto-build a new server. This is often referred to as "treating your servers as cattle rather than pets", though not everyone likes that analogy.

link

jhugo 1542 days ago

In the context of a cluster, fault tolerance allows you to replace nodes without downtime. With automation a kernel update can then be a routine, low effort, low stress task.

Honestly, a kernel update has to be a routine, low effort, low stress task. It's a common event that should be seen as part of the normal operation of the system, not as some exceptional event that means someone has to work on the weekend.

link

kasey_junk 1542 days ago

Theoretically it means that you can be running regular node OS updates as a matter of course simply by replacing some percentage of them on a rapid cadence.

Then there isn’t any stress to doing it, it’s routine and automated.

link