by bombcar
1102 days ago
What does the monitoring actually do for you? I've seen these setups, and even set one up for myself a few times (Grafana, or something similar such as Netdata or Linode's Longview), but I've not really seen what it does for me beyond the "your disk is almost full" warnings.
* ZFS pool errors. Motivator: one of my HDDs failed and it took me a few days to notice. The pool (raidz1) kept chugging along of course.
* HDD and SSD SMART errors
* High HDD and SSD temperatures
* ZFS pool utilization
* High CPU temperature. Motivator: one of my case fans failed and it took a while for me to notice.
* High GPU temperatures. Motivator: I have two GPUs in my tower, one of which I don't really monitor (used for transcoding).
* High (sustained) CPU usage. I track this at the server level, rather than for individual VMs.
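
For the "sustained" part of that last alert, here is a rough Python sketch of the idea, not my actual config. It assumes you already collect CPU usage samples at a fixed interval; the class name `SustainedThreshold`, the 90% threshold and the 3-sample window are made up for illustration.

```python
from collections import deque


class SustainedThreshold:
    """Fires only when every sample in a full window exceeds the threshold.

    Hypothetical helper for illustration; a single spike does not trigger it.
    """

    def __init__(self, threshold: float, window: int) -> None:
        self.threshold = threshold
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        """Record one sample and report whether the alert should fire."""
        self.samples.append(value)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(v > self.threshold for v in self.samples)


if __name__ == "__main__":
    # Assumed values: alert when CPU usage stays above 90% for 3 samples in a row.
    cpu_alert = SustainedThreshold(threshold=90.0, window=3)
    for usage in [95, 96, 50, 97, 98, 99]:
        print(usage, cpu_alert.observe(usage))
```

The same pattern would probably work for the temperature alerts too, since a brief spike under load is usually less interesting than a reading that stays high.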