by mbell 1090 days ago
We used InfluxDB back in the 0.8/0.9 days and it worked really well; it scaled nicely with the large number of metrics we were storing.

The switch to a tag-based architecture in 1.0 completely broke the database for our use case: it could no longer handle large metric cardinality. Things improved a bit around 1.2, but never got back to something usable for us.

We ultimately moved to ClickHouse for time series data and haven't had to think about it since.

Where is InfluxDB at now? Can it handle millions of metrics again? What would bring us back?


InfluxDB 3.0 is built around a columnar query engine (Apache DataFusion) with data stored in Parquet files in object storage. Eliminating cardinality concerns was one of the top drivers for creating 3.0. I mention some of the other big things we wanted to achieve in some other comments in this HN thread.

InfluxDB 3.0 is optimized for ingestion performance and data compression, paired with a fast columnar query engine. So we can ingest with fewer CPUs and less RAM, and reduce storage cost, because it's all compressed and put into object storage. And we support SQL now (in addition to InfluxQL) with fast analytic queries.

We don't have open source releases yet (that's for later this year), but we have it available in the cloud as a multi-tenant product or dedicated clusters.

It sounds like reimplementing ClickHouse, but worse. Sorry for pointing this out, but isn't it?
It actually sports much better ingest performance versus ClickHouse, and query performance is close and improving.
This is a bold claim.
We were stuck on 1.x for a long time. Downsampling seemed to be eternally broken (or rather not performant enough) regardless of version, so we wrote our own downsampler that does it on ingestion (in riemann.io).
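
For anyone wondering what "downsampling on ingestion" means in practice, here is a rough Python sketch. It is not our riemann.io code; the class name, the averaging aggregate, and the assumption that points arrive in timestamp order per series are all made up for illustration.

```python
class IngestDownsampler:
    """Hypothetical helper: averages points per (series, time bucket).

    Assumes timestamps arrive in order within each series. A finished
    bucket is emitted when the first point of a later bucket shows up.
    """

    def __init__(self, bucket_seconds):
        self.bucket_seconds = bucket_seconds
        self._open = {}  # series -> (bucket_start, running_total, count)

    def add(self, series, timestamp, value):
        # Align the timestamp to the start of its bucket.
        bucket = int(timestamp // self.bucket_seconds) * self.bucket_seconds
        emitted = None
        current = self._open.get(series)
        if current is not None and current[0] != bucket:
            # A new bucket started: flush the average of the old one.
            start, total, count = current
            emitted = (series, start, total / count)
            current = None
        if current is None:
            self._open[series] = (bucket, value, 1)
        else:
            start, total, count = current
            self._open[series] = (start, total + value, count + 1)
        return emitted
```

Each call to `add` returns either `None` or one downsampled point, so only the averaged point per bucket has to be written to the database instead of every raw sample.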

And as the world seems to have converged on Prometheus/Prometheus-compatible interfaces, we will probably eventually migrate to VictoriaMetrics or something else "talking Prometheus".

InfluxQL was shit. Flux looks far more complex for the 90%+ of things we use PromQL for now, so that is another disadvantage. I'm sure it's cool for data science, but all we need to do is turn some things into rates and do some basic math or stats on them.
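
To make "turn some things into rates" concrete, here is a small Python sketch of the idea. It is a simplification, not how PromQL's `rate()` is actually implemented: the function name is hypothetical, and it assumes a monotonically increasing counter that restarts from zero on reset.

```python
def per_second_rates(samples):
    """Hypothetical helper: per-second rate between consecutive counter samples.

    `samples` is a list of (timestamp_seconds, counter_value) in time order.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        # If the counter went down, assume it was reset to zero in between.
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append((t1, delta / dt))
    return rates
```

Basic stats (average, max, and so on) can then be computed over the returned rate values with ordinary list operations.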

> Can they handle millions of metrics again? What would bring us back?

We had one instance with ~25 million distinct series eating around 26 GB of RAM. I'd suggest looking into VictoriaMetrics. Mimir is a bit more complicated to run and seems to require far more hardware for similar performance, but it has the distinction (whether that's an advantage or not, eh...) of using object storage instead of plain old disk, which makes HA a bit easier.

Yes, InfluxDB 3.0 uses a new columnar storage engine (IOx) that offers "unbounded cardinality". See more at https://www.influxdata.com/blog/intro-influxdb-iox/

(Disclaimer: I work at InfluxData)