| HN Mirror

by mbell 1090 days ago

We used InfluxDB back in the 0.8/0.9 days and it worked really well, scaling nicely with the large number of metrics we were storing.

The switch to a tag-based architecture in 1.0 completely broke the database for our use case: it could no longer handle large metric cardinality. Things improved a bit around 1.2, but it never got back to something usable for us.

We ultimately moved to using ClickHouse for time series data and haven't had to think about it since.

Where is Influx at now? Can they handle millions of metrics again? What would bring us back?

3 comments

pauldix 1090 days ago

InfluxDB 3.0 is built around a columnar query engine (Apache DataFusion) with data stored in Parquet files in object storage. Eliminating cardinality concerns was one of the top drivers for creating 3.0. I mention some of the other big things we wanted to achieve in some other comments in this HN thread.

InfluxDB 3.0 is optimized for ingestion performance and data compression, paired with a fast columnar query engine. So we can ingest with fewer CPUs, less RAM and reduce storage cost because it's all compressed and put into object store. And we support SQL now (in addition to InfluxQL) with fast analytic queries.

We don't have open source releases yet (that's for later this year), but we have it available in the cloud as a multi-tenant product or dedicated clusters.

zX41ZdbW 1089 days ago

It sounds like reimplementing ClickHouse, but worse. Sorry for pointing this out, but isn't it?

cannonpalms 1089 days ago

It actually sports much better ingest performance versus ClickHouse, and query performance is close and improving.

zX41ZdbW 1088 days ago

This is a bold claim.

ilyt 1090 days ago

We were stuck on 1.x for a long time. Downsampling seemed to be eternally broken (or rather not performant enough) regardless of version, so we wrote our own downsampler that does it on ingestion (in riemann.io).
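To illustrate what downsampling on ingestion can look like, here is a minimal sketch. It is not the commenter's Riemann setup: the `Point` tuple layout, the `downsample` function and the 60-second bucket are all assumptions made for the example. It averages every point of a series that falls into the same time bucket before anything is written to the database.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical point layout: (series name, unix timestamp in seconds, value)
Point = Tuple[str, int, float]


def downsample(points: Iterable[Point], bucket_seconds: int = 60) -> List[Point]:
    """Average all points of a series that fall into the same time bucket."""
    sums: Dict[Tuple[str, int], float] = defaultdict(float)
    counts: Dict[Tuple[str, int], int] = defaultdict(int)
    for name, ts, value in points:
        bucket_start = ts - (ts % bucket_seconds)  # align to the bucket boundary
        sums[(name, bucket_start)] += value
        counts[(name, bucket_start)] += 1
    # One output point per (series, bucket), stamped with the bucket start.
    return sorted(
        (name, start, sums[(name, start)] / counts[(name, start)])
        for (name, start) in sums
    )


raw = [("cpu", 0, 10.0), ("cpu", 30, 20.0), ("cpu", 60, 40.0), ("mem", 15, 5.0)]
downsampled = downsample(raw, bucket_seconds=60)
```

A real ingestion pipeline would work on a stream and flush each bucket once it closes; the batch form here only shows the aggregation step.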

And as the world seemed to converge on Prometheus/Prometheus-compatible interfaces, we will probably eventually migrate to VictoriaMetrics or something else "talking Prometheus".

InfluxQL was shit. Flux looks far more complex for the 90%+ of things we use PromQL for now, so that is another disadvantage. I'm sure it's cool for data science, but all we need to do is turn some things into rates and do some basic math or stats on them.
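For readers unfamiliar with "turning things into a rate", a rough sketch of the idea follows. It is a simplification, not PromQL's actual `rate()` (which also extrapolates to the window boundaries); the `simple_rate` name and the sample layout are assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical sample layout: (unix timestamp in seconds, counter value)
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase of a counter over the sampled window.

    A drop in value is treated as a counter reset (restart from 0).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset, the whole current value counts as new increase.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return increase / elapsed


# The counter resets between t=15 and t=30.
requests = [(0.0, 100.0), (15.0, 130.0), (30.0, 10.0), (45.0, 40.0)]
per_second = simple_rate(requests)
```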

> Can they handle millions of metrics again? What would bring us back?

We had one instance with ~25 million distinct series eating around 26 GB of RAM. I'd suggest looking into VictoriaMetrics. Mimir is a bit more complicated to run and seems to require far more hardware for similar performance, but it has the distinction (whether that's an advantage or not, eh...) of using an object store instead of plain old disk, which makes HA a bit easier.
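As a back-of-the-envelope check on those numbers, the arithmetic below works out the memory cost per series. It assumes decimal gigabytes and takes the two approximate figures at face value, so the result is only an order-of-magnitude estimate.

```python
# Figures quoted in the comment above, both approximate.
series = 25_000_000
ram_bytes = 26 * 10**9  # assumption: "GB" means 10^9 bytes

# Roughly 1 KB of RAM per distinct series.
bytes_per_series = ram_bytes / series
```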

ithkuil 1090 days ago

Yes, InfluxDB 3.0 uses a new columnar storage engine (IOx) that offers "unbounded cardinality". See more at https://www.influxdata.com/blog/intro-influxdb-iox/

(Disclaimer: I work at InfluxData)
