by endymi0n 3697 days ago
We're extremely happy campers here combining the Grafana dashboard with https://prometheus.io/ Datastore and Alerts. That project has some serious traction and is one of just a few that seem actually built for the cloud and distributed systems first, as it's primarily role-based rather than host-based.
3 comments

Can I ask how you manage logs?

My minimal understanding is that Prometheus is time-series only, so you'd have to supplement it with something like ElasticSearch to aggregate logs. Does this mean you are alerting only on metrics, or do you have multiple alert systems, or ...?

We run Grafana for production dashboards with a KairosDB (we are a C* shop) and use ELK for text logs. Grafana can add annotations from ElasticSearch, but beyond that we are looking at our ES alerting options.
Interesting. I briefly looked at it previously but have never had the chance to play around.

Did you (or has anyone else) migrate your legacy data from graphite to prometheus? What's the grafana support like on top? Do you need to run a proxy or something to support the same querying or does it require reworking all the existing graphs?

Prometheus developer here.

> Did you (or has anyone else) migrate your legacy data from graphite to prometheus?

Prometheus isn't intended as a long-term data store, so there's usually not much point. The data model is also quite different.

> What's the grafana support like on top?

Grafana supports Prometheus as a first-class integration, and when I spoke with the Raintank team in person they were very supportive of Prometheus.

> Do you need to run a proxy or something to support the same querying or does it require reworking all the existing graphs?

The data model and query language are more powerful and very different, so you have to redo all the graphs.

To aid the transition we have https://github.com/prometheus/graphite_exporter, which takes in Graphite-formatted exports from your clients and converts them to the format Prometheus likes.
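To make the idea of that conversion concrete, here is a rough sketch of turning one Graphite plaintext line (`path value timestamp`) into a Prometheus text-format sample. This is not the graphite_exporter's actual logic (the real exporter is a separate service with its own mapping configuration); the function name `graphite_to_prometheus` and the naive "replace invalid characters with underscores" rule are assumptions made purely for illustration.

```python
import re


def graphite_to_prometheus(line: str) -> str:
    """Convert one Graphite plaintext line to a Prometheus text-format sample.

    Hypothetical helper for illustration only; it flattens the dotted path
    into a single metric name and does not derive any labels.
    """
    path, value, timestamp = line.split()
    # Prometheus metric names allow [a-zA-Z0-9_:], so dots and other
    # characters are replaced with underscores.
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", path)
    # A metric name may not start with a digit.
    if name[0].isdigit():
        name = "_" + name
    # Graphite timestamps are in seconds; the Prometheus text format
    # expects milliseconds.
    return f"{name} {float(value)} {int(float(timestamp)) * 1000}"
```

For example, `graphite_to_prometheus("servers.web01.cpu.load 0.75 1460000000")` yields `servers_web01_cpu_load 0.75 1460000000000`. Note that the hostname stays baked into the metric name here, which is exactly the kind of thing you would rather express as a label in Prometheus's data model.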

The one thing that holds me back is the pull nature of Prometheus. We are pushing metrics already, so moving to a pull model feels like a return to the 20th century.
It actually tends to work a lot better. You almost never need metric granularity so fine that push is necessary. If you do, you're usually using a purpose-built tool for that. It also saves you from accidentally DoSing yourself when someone unexpectedly starts emitting more metrics than you can ingest (due to a bug, more traffic, etc.). All of this is doable with push-based systems, but they tend to end up using queues or something similar to compensate, which is its own kind of pain.
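For anyone who hasn't seen the pull side from the application's point of view, a minimal sketch using only the Python standard library might look like the following. The `COUNTERS` dict, the `render_metrics` helper, and port 8000 are all made up for illustration; in practice you would more likely use an official client library. The point is only that the app keeps cheap in-memory counters and the server scrapes them at its own pace.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; a real service would update these
# as it handles work.
COUNTERS = {"requests_total": 0, "errors_total": 0}


def render_metrics(counters: dict) -> str:
    """Render counters as one 'name value' line per metric, sorted by name."""
    return "".join(f"{name} {value}\n" for name, value in sorted(counters.items()))


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; the scraper decides how often to call it.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape job at `http://localhost:8000/metrics` is all the wiring there is. If the app suddenly tracks far more events, the counters just grow; the ingest rate stays whatever the scrape interval says.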
Actually, you probably want to have both. Push is great when you are collecting results of some operations, otherwise you would have to save data somewhere until it is fetched.
The https://github.com/prometheus/pushgateway serves those cases, primarily service-level batch jobs.
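As a rough sketch of what a batch job's push could look like, the snippet below builds (but does not send) an HTTP request against the Pushgateway's `/metrics/job/<job>` endpoint using only the standard library. The gateway address, job name, and metric name are assumptions for illustration, and `build_push_request` is a hypothetical helper, not part of any Prometheus client library.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway: str, job: str, metrics: dict) -> urllib.request.Request:
    """Build a PUT request that pushes 'name value' samples for one job.

    Hypothetical helper: it only constructs the request, so nothing is
    sent until the caller passes it to urllib.request.urlopen().
    """
    # The text format is line-based and the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    url = f"{gateway.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    # PUT replaces all metrics previously pushed under this job grouping.
    return urllib.request.Request(url, data=body.encode("utf-8"), method="PUT")
```

A nightly job might finish with something like `urllib.request.urlopen(build_push_request("http://localhost:9091", "nightly_backup", {"backup_duration_seconds": 42.5}))`, after which Prometheus scrapes the gateway like any other target.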