by krisoft
745 days ago
> When you read industrial accident reports one extremely common theme is some sensor that was notifying the controller of the problem, but …

I remember an accident report. It was about a container ship which had a bad flooding incident in its engineering spaces. One thing the report pointed out was that the engineers had ways to fight the flooding, but they were not using them because they were playing whack-a-mole with all the alarms caused by the flood. If I recall correctly, the engineers kept ignoring the waist-deep and rising water and prioritised silencing the alarms. (And not because they were stupid, but just because the many independent blaring alarms task-saturated them.)
That's common in computer monitoring systems. At my last job, when we had a serious outage, we'd get dozens of pager alerts, and it was hard to figure out the root cause because so many of the alerts that fired were just symptoms of it. For example, if the root cause was the root volume running out of disk space, the "unable to log in" alert was superfluous and unhelpful. Eventually we moved to a better system that had a better sense of hierarchy for alerts, as well as a way to easily silence them.
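
To make the "hierarchy for alerts" idea concrete, here is a minimal sketch in Python. It is not the system we actually used. The alert names and the `CAUSED_BY` map are made up for illustration. It assumes each alert has at most one known upstream cause, and it only pages for alerts that have no upstream cause currently firing.

```python
# Hypothetical dependency map: symptom alert -> the alert that can cause it.
# All names here are invented for illustration.
CAUSED_BY = {
    "unable_to_log_in": "root_volume_full",
    "log_writes_failing": "root_volume_full",
    "api_errors": "unable_to_log_in",
}


def pages_to_send(firing):
    """Return the firing alerts that have no firing ancestor, sorted by name."""
    firing = set(firing)
    roots = []
    for alert in sorted(firing):
        parent = CAUSED_BY.get(alert)
        seen = set()  # guards against an accidental cycle in the map
        suppressed = False
        while parent is not None and parent not in seen:
            if parent in firing:
                # An upstream cause is already firing, so this alert is noise.
                suppressed = True
                break
            seen.add(parent)
            parent = CAUSED_BY.get(parent)
        if not suppressed:
            roots.append(alert)
    return roots


print(pages_to_send(["unable_to_log_in", "root_volume_full", "api_errors"]))
```

With the disk-full example above, only `root_volume_full` would page. The login and API alerts are still recorded as firing, but they are held back because something upstream of them already explains them.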