|
|
|
|
|
by ttymck
774 days ago
|
|
For my applications, monitored by prometheus + grafana, we have alerts when no data is reported for certain metrics in the past 5 minutes, indicating a malfunction in the subsystem. With a metric, you can use a monotonic counter to serve as a heartbeat. A timestamp would work. In your monitoring system, when the heartbeat value has not increased in X minutes, you alert. |
|