Hacker News
by tayo42 3622 days ago
Datadog was down so often when I had to use it. It felt so unreliable. We used it to monitor hosts, and it got to the point where checking if Datadog was down was part of troubleshooting.

How long ago was that? We just switched in January and it's been pretty reliable.

As far as Datadog goes, it's the most team-friendly dashboard system we've used. We had a specialty monitoring system for one application stack previously, and no one made custom dashboards there or even just looked at the data. Now we've got custom dashboards out the nose and we're gradually consolidating to a "best of" dashboard for each service.

Datadog may be okay if you're doing really simple stuff and not sending much data. Once you get to scale, you will need a system like Wavefront. Wavefront can take millions of data points per second, query on them super fast, and they don't go down. Every other monitoring system downsamples, or throws away your data after a certain amount of time.
We throw hundreds of thousands of metrics at Datadog per minute from thousands of hosts; it hasn't broken a sweat yet.

Are you affiliated with Wavefront?

I'm a happy customer of Wavefront. I completely believe that Datadog can handle hundreds of thousands per minute -- especially if most of them are pre-canned, non-custom metrics grabbed by their agent. Hundreds of thousands of metrics/minute is a few thousand a second only. Wavefront does millions of custom metrics per second, which can be sent with different dimensions and tags. That's much harder.
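
For a rough sense of the gap, here is the per-minute to per-second conversion as a small Python sketch. The sample rates are illustrative picks within "hundreds of thousands", not figures anyone in this thread reported, and `per_second` is just a helper name made up for the example.

```python
def per_second(per_minute: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


# Assumed sample rates: the thread only says "hundreds of thousands per minute".
SAMPLE_RATES_PER_MINUTE = (100_000, 300_000, 900_000)

# The "millions per second" claim, taken at its lower bound of one million.
CLAIMED_PER_SECOND = 1_000_000

for per_minute in SAMPLE_RATES_PER_MINUTE:
    rate = per_second(per_minute)
    ratio = CLAIMED_PER_SECOND / rate
    print(f"{per_minute:,}/min is about {rate:,.0f}/s; "
          f"{CLAIMED_PER_SECOND:,}/s is roughly {ratio:,.0f}x that")
```

So anything in that range works out to somewhere between a couple thousand and fifteen thousand points per second, which is two to three orders of magnitude below a million per second.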
Yeah I second this. Hundreds of thousands per minute is a very small number. Wavefront can go over a million per second.
Per tenant, or globally? And does it have an SLA around this?

(I've learned not to trust numbers that seem too good to be true unless they're contractually obligated.)

The UI is pretty nice for sure, especially compared to some other things I'm using.

I used it up until two months ago, when I left that job. There were ~2000 servers monitored, I think.