In this context it means you can take individual servers offline without taking your entire service down. You can then update each server live, even on production systems, without requiring a maintenance window.
For bonus points, you're also not babysitting manually provisioned servers; your software installs are automated. A server failure or an OS update then isn't a maintenance task: you just terminate the old server and let your pipeline auto-build a new one. This is often referred to as "treating your servers as cattle rather than pets", though not everyone likes that analogy.
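To make the "one server at a time" idea concrete, here is a minimal Python sketch. The function name `rolling_update`, the `update` callback, and the `min_in_service` threshold are all made up for illustration; a real setup would drain and re-register servers through a load balancer or orchestrator rather than a local set.

```python
from typing import Callable, List


def rolling_update(servers: List[str],
                   update: Callable[[str], None],
                   min_in_service: int) -> List[str]:
    """Update servers one at a time, never dropping below min_in_service."""
    if len(servers) - 1 < min_in_service:
        raise ValueError("not enough servers to take one offline safely")

    in_service = set(servers)
    updated = []
    for server in servers:
        in_service.remove(server)   # drain: stop sending traffic to this server
        assert len(in_service) >= min_in_service
        update(server)              # hypothetical step: patch, reboot, or rebuild
        in_service.add(server)      # return the server to the pool
        updated.append(server)
    return updated
```

Because only one server is out of the pool at any moment, the service keeps running for the whole update, which is why no maintenance window is needed.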
In the context of a cluster, fault tolerance allows you to replace nodes without downtime. With automation, a kernel update can then be a routine, low-effort, low-stress task.
Honestly, a kernel update has to be a routine, low-effort, low-stress task. It's a common event that should be seen as part of the normal operation of the system, not as some exceptional event that means someone has to work on the weekend.
Theoretically it means that you can run regular node OS updates as a matter of course, simply by replacing some percentage of your nodes on a rapid cadence.
Then there isn't any stress in doing it; it's routine and automated.
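As a rough sketch of what "replacing some percentage on a cadence" could look like, the snippet below picks the oldest nodes for each replacement cycle. The function name `pick_nodes_to_replace`, the age map, and the oldest-first policy are assumptions for illustration, not something any particular orchestrator prescribes.

```python
import math
from typing import Dict, List


def pick_nodes_to_replace(node_ages_days: Dict[str, float],
                          fraction: float) -> List[str]:
    """Return the oldest `fraction` of nodes, rounded up, oldest first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    count = math.ceil(len(node_ages_days) * fraction)
    # Oldest nodes are the most likely to be running a stale OS image.
    oldest_first = sorted(node_ages_days, key=node_ages_days.get, reverse=True)
    return oldest_first[:count]


# Hypothetical fleet: node name -> days since the node was built.
ages = {"n1": 30, "n2": 5, "n3": 12, "n4": 1}
print(pick_nodes_to_replace(ages, 0.25))  # ['n1']
```

Run on a schedule, each cycle terminates the selected nodes and lets the pipeline build fresh ones from the current image, so the whole fleet turns over without anyone treating it as a special event.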