by jbyers
6745 days ago
Consider collectd (http://collectd.org/). Unlike a lot of the usual suspects, collectd is a daemon that records the standard server health stats into RRD files every 10 seconds. In two years of running it on a few dozen systems, it has never failed or caused undue load on its own. We often see events that would have gone completely unnoticed in a 5-minute monitoring window.
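
To make the resolution point concrete, here is a rough sketch in Python. It is not collectd code: the load trace, the `load_at` helper, and the `window_averages` function are all made up for illustration, and it assumes the coarser monitor reports an average over its window. It compares what a 20-second spike looks like at 10-second versus 5-minute resolution.

```python
# Hypothetical load trace: one reading per second for 5 minutes (300 s).
# Baseline load is 1.0, with a 20-second spike to 9.0 starting at t=100.
def load_at(t):
    return 9.0 if 100 <= t < 120 else 1.0


def window_averages(interval, duration=300):
    """Average the per-second trace over consecutive windows of `interval` seconds."""
    return [
        sum(load_at(t) for t in range(start, start + interval)) / interval
        for start in range(0, duration, interval)
    ]


ten_second = window_averages(10)    # 30 data points
five_minute = window_averages(300)  # 1 data point

print("peak at 10-second resolution:", max(ten_second))
print("value at 5-minute resolution:", five_minute[0])
```

At 10-second resolution the spike shows up at full height in two consecutive data points. Averaged over the whole 5 minutes it is diluted to roughly 1.5, which is easy to mistake for normal variation.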