Hacker News
by billoday 2953 days ago
For us, it tends to be driven by the language/logging framework in use. With syslog-style log levels, we log at critical/fatal for the REALLY bad ones. With npm-style log levels, we try to put the string CRITICAL in the message and trigger the alert on a single occurrence of such a message. General alerts use an "x occurrences over y time" trigger, with x and y tuned to usage/traffic/noise. We also generally encourage using warn as the default error state, and only hitting error when things are actually broken.
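As a rough illustration of the two triggers described above, here is a minimal Python sketch, not our actual alerting setup. The class name `AlertRule`, its parameters, and the sample values are all hypothetical. It assumes log lines arrive in timestamp order, alerts immediately on a critical/fatal level or a message containing CRITICAL, and otherwise alerts once `threshold` error-level lines (x) fall within `window_seconds` (y).

```python
from collections import deque


class AlertRule:
    """Hypothetical rule: alert at once on CRITICAL, else on x errors in y seconds."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold            # x: errors needed to alert (assumed value)
        self.window_seconds = window_seconds  # y: sliding window length (assumed value)
        self._error_times = deque()           # timestamps of recent error-level lines

    def observe(self, timestamp, level, message):
        """Return True if this log line should trigger an alert."""
        # Single-occurrence trigger: syslog-style level or the CRITICAL marker string.
        if level in ("critical", "fatal") or "CRITICAL" in message:
            return True
        # warn is the default error state and never alerts on its own.
        if level != "error":
            return False
        self._error_times.append(timestamp)
        # Drop errors that have fallen out of the sliding window.
        while timestamp - self._error_times[0] >= self.window_seconds:
            self._error_times.popleft()
        return len(self._error_times) >= self.threshold


rule = AlertRule(threshold=3, window_seconds=60)
print(rule.observe(0, "warn", "disk at 80%"))
print(rule.observe(1, "error", "CRITICAL: db down"))
```

Note that this sketch keeps alerting on every further error while the window stays over the threshold; a real setup would probably add some de-duplication or a cool-down.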