Hacker News
by justin_oaks 1164 days ago
I had a boss who had an inbox with literally hundreds of thousands of unread emails. A good chunk of those emails were "success" messages from batch processes.

It's quite correct to send a "success" message when a batch process completes successfully, but it's quite wrong to send that message to a human. It should go to a machine that translates a missing success message into an error message/alert for humans to respond to.

For example, I have a set of nightly backup jobs. The last step of each backup process is to send a success message to my monitoring system. I only get a "Missing Backup" alert when the monitoring system detects that it didn't receive the success message it expected for a particular backup.
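
To make the idea concrete, here is a minimal sketch of that kind of check in Python. It is not the actual monitoring system described above: the class name `HeartbeatMonitor`, the job names, and the 25-hour window are all made up for illustration. Each job reports success, and the monitor only produces output for jobs whose success message is overdue.

```python
from datetime import datetime, timedelta


class HeartbeatMonitor:
    """Hypothetical monitor: turns a *missing* success message into an alert."""

    def __init__(self):
        self._expected = {}      # job name -> longest allowed gap between successes
        self._last_success = {}  # job name -> time of the most recent success

    def expect(self, job, max_gap):
        """Register a job that must report success at least every max_gap."""
        self._expected[job] = max_gap

    def record_success(self, job, when):
        """Called by the last step of a job once it has finished successfully."""
        self._last_success[job] = when

    def missing(self, now):
        """Return one alert per job whose success message is overdue."""
        alerts = []
        for job, max_gap in sorted(self._expected.items()):
            last = self._last_success.get(job)
            # A job that has never reported is treated as overdue too.
            if last is None or now - last > max_gap:
                alerts.append(f"Missing Backup: {job}")
        return alerts


# Illustrative data only: two nightly jobs, one of which stopped reporting.
monitor = HeartbeatMonitor()
monitor.expect("db-nightly", timedelta(hours=25))
monitor.expect("files-nightly", timedelta(hours=25))

now = datetime(2024, 1, 2, 6, 0)
monitor.record_success("db-nightly", datetime(2024, 1, 2, 1, 30))
monitor.record_success("files-nightly", datetime(2023, 12, 31, 1, 30))

print(monitor.missing(now))  # ['Missing Backup: files-nightly']
```

The healthy job produces no output at all, so a human only ever sees the gap. The window is set a bit longer than the 24-hour schedule so a slightly late run doesn't trigger a false alert.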

My old boss didn't seem to understand the concept that people don't generally notice missing messages. Or he was too lazy/incompetent to use a monitoring system that could translate gaps in successes into errors.

1 comment

Even that is utterly unnecessary because we use Control-M for basically all of the batch work in my area that I know of, and there's already automation that opens an Incident on a job failure, which can flow into the whole on-call system! If a job or cycle is critical and needs to finish by a certain time, you can set up messages to go out at that time and everything.