by aprdm
2159 days ago
Can you expand? As someone who maintains a large-ish on-prem Prometheus/Grafana installation, I don't know what we're missing! We have a couple of custom metrics that we developed in Prometheus, and the OSS plugins/dashboards look great.
|
It was also mind-blowing how things were integrated. For example: see a slow request? Click into the APM trace. Notice a service on that trace being slow? Click on it and see what host it was running on. From there, another button pulls up all the Docker containers that were running on that host at that point in time. The CPU usage is visualized, and aha! We forgot to set a CPU limit on one of those other jobs.
Debugging issues like that would've been nearly impossible otherwise, and we had more than a few cases of that.