by acid__
2167 days ago
From my perspective at a smaller startup (so YMMV), the first thing I noticed after switching to Datadog was how _fast_ it was. The queries and graphing capabilities were also really powerful. Or maybe I just didn't know how to use the old tools, but regardless, it was super easy to pick up and do things I had struggled with in Prometheus/Grafana. It was also mind-blowing how well things were integrated. For example: see a slow request? Click into the APM trace. Notice a service on that trace being slow? Click on it and see what host it was running on. From there, another button pulls up all the Docker containers running on that host at that point in time. The CPU usage is visualized and, aha! We forgot to set a CPU limit on one of those other jobs. Debugging issues like that would've been nearly impossible otherwise, and we had more than a few cases of that.
|
As far as speed goes, I haven't had that issue with Prometheus. We use recording rules for things that benefit from being pre-computed.
I imagine the UX is quite different when you're using a product, though.
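
For anyone who hasn't used them, a recording rule looks roughly like this. It's a minimal sketch, not taken from any real setup: the group name, the 30s interval, and the `http_requests_total` metric are made up for illustration.

```yaml
# Hypothetical rule file, loaded via rule_files in prometheus.yml
groups:
  - name: example_http_rules        # illustrative group name
    interval: 30s                   # how often the rules in this group are evaluated
    rules:
      # Pre-compute the per-job request rate so dashboards query
      # the cheap recorded series instead of the raw expression.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

A dashboard panel can then query `job:http_requests:rate5m` directly, which is usually where the speedup comes from on expensive aggregations.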