Hacker News
by Zenfinch 3622 days ago
If you can have your monitoring system in the cloud, Datadog is a great choice.

Good documentation, UI, many, many plugins and fair pricing (IMO).

https://www.datadoghq.com/

(I'm not affiliated with them in any way, other than using their product on a pet project with many moving parts.)

2 comments

Datadog was down so often when I had to use it. It felt so unreliable. We used it to monitor hosts, and it got to the point where checking whether Datadog was down was part of troubleshooting.
How long ago was that? We just switched in January and it's been pretty reliable.

As far as Datadog goes, it's the most team-friendly dashboard system we've used. We previously had a specialty monitoring system for one application stack, and no one made custom dashboards there or even just looked at the data. Now we've got custom dashboards out the nose and we're gradually consolidating to a "best of" dashboard for each service.

Datadog may be okay if you're doing really simple stuff and not sending much data. Once you get to scale, you will need a system like Wavefront. Wavefront can take millions of data points per second, query on them super fast, and they don't go down. Every other monitoring system downsamples, or throws away your data after a certain amount of time.
We throw hundreds of thousands of metrics at Datadog per minute from thousands of hosts; it hasn't broken a sweat yet.

Are you affiliated with Wavefront?

I'm a happy customer of Wavefront. I completely believe that Datadog can handle hundreds of thousands per minute -- especially if most of them are pre-canned, non-custom metrics grabbed by their agent. Hundreds of thousands of metrics/minute is a few thousand a second only. Wavefront does millions of custom metrics per second, which can be sent with different dimensions and tags. That's much harder.
Yeah I second this. Hundreds of thousands per minute is a very small number. Wavefront can go over a million per second.
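
For anyone who wants to check the per-minute vs. per-second comparison above, here is a minimal Python sketch. The sample rates are made-up illustrative values, not measurements from Datadog or Wavefront, and `per_second` is just a hypothetical helper name.

```python
def per_second(points_per_minute: float) -> float:
    """Convert a points-per-minute rate to points per second."""
    return points_per_minute / 60.0


if __name__ == "__main__":
    # Illustrative rates only; "hundreds of thousands per minute" is the
    # only figure taken from the thread.
    for rate in (100_000, 300_000, 900_000):
        print(f"{rate:,} points/minute is about {per_second(rate):,.0f} points/second")
```

So "hundreds of thousands per minute" works out to somewhere in the low thousands to low tens of thousands per second, which is still well short of a million per second.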
The UI is pretty nice for sure, especially compared to some other things I'm using.

I used it up until two months ago, when I left that job. There were ~2000 servers monitored, I think.

Just make sure you understand their pricing (I didn't): http://sagemath.blogspot.com/2016/07/datadogs-pricing-dont-m...
How could their pricing page be clearer? It says per host in fairly large letters underneath it.

I'm asking because I will be designing a similar page soon (that's also billed per host) and I'd like to avoid the same mistakes.

[EDIT: This pricing page by the top poster in this thread is way better than what I suggest below -- https://www.serverdensity.com/pricing/]

1. VERY clearly state that when you sign up for the service, you are on the hook for up to $18*500 = $9000 + tax in charges for any month. Even Google Compute Engine (and Amazon) don't create such a trap, and have a clear, explicit quota increase process.

2. Instead of "HUGE $15" newline "(small light) per host", put "HUGE $18 per host" all on the same line. It would easily fit. I don't even know how the $15/host Datadog discount could ever really work, given that the number of hosts might constantly change and there is no prepayment.

3. Inform users clearly in the UI at any time how much they are going to owe for that month (so far), rather than surprising them at the end. Again, Google Cloud Platform has a very clear running total in their billing section, and any time you create a new VM it gives the exact amount that VM will cost per month.

4. If one works with a team, 3 is especially important. The reason that I had monitors on 50+ machines is that another person working on the project, who never looked at pricing or anything, just thought -- hey, I'll just set this up everywhere. He had no idea there was a per-machine fee.
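
To make the arithmetic in points 1 and 3 concrete, here is a minimal Python sketch of a per-host running total. The $18 price and the 500-host ceiling are the figures quoted above. The flat "hosts times price" billing model with no proration or tax is an assumption for illustration, not a description of how Datadog actually bills, and the function names are hypothetical.

```python
PRICE_PER_HOST_USD = 18  # per-host monthly price quoted in point 1
HOST_CAP = 500           # host ceiling implied by the $18*500 figure in point 1


def monthly_charge(hosts: int, price_per_host: float = PRICE_PER_HOST_USD) -> float:
    """Pre-tax monthly charge, assuming a flat per-host fee (an assumption)."""
    if hosts < 0:
        raise ValueError("hosts must be non-negative")
    return hosts * price_per_host


def worst_case_charge(host_cap: int = HOST_CAP,
                      price_per_host: float = PRICE_PER_HOST_USD) -> float:
    """Largest pre-tax monthly charge possible under the host cap."""
    return monthly_charge(host_cap, price_per_host)


if __name__ == "__main__":
    # The kind of running total point 3 asks the UI to show.
    for hosts in (1, 50, HOST_CAP):
        print(f"{hosts} hosts: ${monthly_charge(hosts):,.2f} per month before tax")
```

Under these assumptions, the 50 machines mentioned in point 4 come to $900 a month, and the ceiling is the $9000 from point 1.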