by halfmatthalfcat 2143 days ago

Prometheus and Grafana are awesome; I use them personally for all my monitoring.

However I’m still trying to nail down my high cardinality/highly unique metrics-like data story. What are people using?

I’ve heard a combination of Cassandra/BigTable and Spark suggested as a potential solution?

7 comments

latchkey 2143 days ago

I found this interesting. My plan is to move from Prom to Victoria.

https://medium.com/@valyala/measuring-vertical-scalability-f...

akulkarni 2143 days ago

Just a heads up: this is an old comparison (over a year old) that hasn't been updated since; TimescaleDB now supports native compression. (The blog post references TimescaleDB 1.2.2; the product is now on 1.7.2.)

sagichmal 2143 days ago

Woof, good luck. Not a great product.

PerusingAround 2143 days ago

Care to elaborate? At least a slight mention of why.

akulkarni 2143 days ago

TimescaleDB is a long-term storage option for Prometheus metrics, has no problem with high cardinality, and now natively supports PromQL (in addition to SQL) [0].

(Disclaimer: I work at Timescale)

[0] https://github.com/timescale/timescale-prometheus

chinhodado 2142 days ago

I'm just starting to look into this and have a question. If I can export my metrics directly to TimescaleDB and it supports visualization with Grafana, is there any reason to go through Prometheus?

akulkarni 2142 days ago

Good question. The advantage of Prometheus is the ability to scrape from a variety of endpoints (seems like more and more things are exposing the Prometheus format).

There are some who write metrics directly to TimescaleDB, while others prefer going through Prometheus to take advantage of that ecosystem.

Best part: We support both!
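
For anyone who hasn't seen the Prometheus format up close, here is a rough sketch of what an endpoint serves for scraping. It only builds the text by hand for illustration; the function names (`render_metric`, `escape_label_value`) and the example metric are made up, and a real service would normally use an official client library instead.

```python
def escape_label_value(value: str) -> str:
    # Label values escape backslash, double quote, and newline in the text format.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as Prometheus-style text.

    `samples` is a list of (labels_dict, value) pairs. This is a hypothetical
    helper for illustration, not part of any Prometheus library.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metric(
        "http_requests_total",          # example metric name (assumption)
        "Total HTTP requests.",
        "counter",
        [({"method": "get", "code": "200"}, 1027)],
    ))
```

Every distinct combination of label values becomes its own time series, which is where the cardinality question in this thread comes from.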

Legogris 2143 days ago

I'd be curious to hear if anyone has done a serious evaluation of high-cardinality use cases with VictoriaMetrics.

chucky_z 2143 days ago

I went from an Influx instance that was getting crushed to VictoriaMetrics (VM) running in a container with 1/8th the resources, and it works fine at 1.5M active cardinality. It could probably handle a lot more. Autofill in Grafana breaks, but oh well!

jdub 2143 days ago

Honeycomb (https://honeycomb.io/) is laser-focused on this stuff. They built their own datastore (similar to Druid but schemaless) so they could create the experience they were aiming for.

They talk a lot about collaborative troubleshooting, and the user interface reflects that. It's actually fun (?!) to drill down from heatmaps to individual events with Honeycomb's little comparison charts lighting the way.

Nihilartikel 2143 days ago

I've used Druid (druid.io) in the past and it worked well, but it's a lot of trouble to set up and tune. I haven't tried it, but ClickHouse looks good and has approximate aggregations for high-cardinality dimensions.

jpgvm 2143 days ago

Druid truly is still king in this space. The setup has become less onerous over time. It handles arbitrarily high cardinality and dimensionality with ease and its support for sketching algorithms leaves other similar systems (especially Prometheus) in the dust.
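
To give a feel for what "sketching" buys you, here is a minimal K-minimum-values (KMV) distinct-count sketch. To be clear, this is not Druid's implementation, just one simple member of the same family of techniques; the class name `KMVSketch` and the default `k` are my own choices for illustration. It keeps only the `k` smallest hashes seen, so memory stays fixed no matter how many distinct values flow through.

```python
import hashlib
import heapq


class KMVSketch:
    """Approximate distinct counting that retains only the k smallest hashes."""

    HASH_SPACE = 2 ** 64

    def __init__(self, k: int = 1024):
        self.k = k
        self._heap = []        # negated hashes, so the largest retained hash is at index 0
        self._members = set()  # retained hashes, used to ignore duplicates

    @staticmethod
    def _hash(item: str) -> int:
        # The first 8 bytes of SHA-1 act as a uniform 64-bit hash.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item: str) -> None:
        h = self._hash(item)
        if h in self._members:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # Replace the largest retained hash with the new, smaller one.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self) -> float:
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen, so the count is exact.
            return float(len(self._heap))
        kth_smallest = -self._heap[0] / self.HASH_SPACE
        return (self.k - 1) / kth_smallest


if __name__ == "__main__":
    sketch = KMVSketch(k=1024)
    for i in range(50_000):
        sketch.add(f"user-{i}")
    print(round(sketch.estimate()))
```

With `k = 1024` the expected relative error is on the order of a few percent, and sketches like this can be merged across segments, which is what makes them attractive for high-cardinality dimensions.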

jrott 2143 days ago

Spark has worked decently for me if you need to be cloud agnostic.

Currently I’m in AWS land and Athena has been mostly working for what I need but I haven’t really pushed it that hard yet.

sagichmal 2143 days ago

Just curious what your numbers are? Unique metrics, cardinality per metric, ingest rate, expected query ranges?
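
If it helps to pin down what "cardinality per metric" means here, a rough way to count it is the number of distinct label sets seen per metric name. The sketch below assumes you can dump samples as (metric name, labels) pairs; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def cardinality_per_metric(samples):
    """Count distinct label sets (i.e. series) per metric name.

    `samples` is an iterable of (metric_name, labels_dict) pairs.
    """
    series = defaultdict(set)
    for name, labels in samples:
        # A frozenset of label pairs identifies one series regardless of label order.
        series[name].add(frozenset(labels.items()))
    return {name: len(label_sets) for name, label_sets in series.items()}


if __name__ == "__main__":
    # Made-up data: a per-user label is what typically blows up cardinality.
    samples = [
        ("http_requests_total", {"method": "get", "user_id": "1"}),
        ("http_requests_total", {"method": "get", "user_id": "2"}),
        ("http_requests_total", {"user_id": "1", "method": "get"}),  # same series as the first
        ("up", {"job": "node"}),
    ]
    print(cardinality_per_metric(samples))
```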

base698 2143 days ago

I have an instance that scrapes about 30K targets for 15 million metrics and it works better than you'd expect. The biggest performance issue we have is rendering the targets page.

We have a plan to split it down to fewer instances per node, but it's worked well enough so far.
