|
|
|
|
|
by tpxl
1386 days ago
|
|
I'm going to disagree here. The point of a manually updated status page is appearances. With proper reporting it's trivial to know which subsystem is experiencing problems, if any. It doesn't have to be very granular, just "normal", "experiencing issues", "offline". If reporting doesn't work, you should be alerted it doesn't work, and if alerting doesn't work, there needs to either be out-of-band alerting for that or someone monitoring the status at all times. Manual overrides for status pages should exist for when the automation doesn't work of course. At my last job we had a big screen in the office we monitored (Grafana) and we usually saw problems before the alerting kicked in - it had about a minute delay. When not in-office/during work hours, the on-call received alerts. It wasn't technically nor organisationally complex. |
|