by vulkoingim
209 days ago
Have a look at VictoriaMetrics. I have run it at relatively high scale with much more success than any other metrics store. It's one of those things that just works. It's extremely easy to run in single-instance mode and handles much more than you would expect. Scaling it is a breeze too. (I'm not affiliated, just a very happy user across multiple orgs and personal projects.)
With all of these systems that store metrics in object storage, you have to remember that object storage is not file storage. Generally speaking (S3 Express One Zone being a relatively recent exception), you cannot append to objects. Metrics queries are resolved by querying historical metrics in object storage plus a stateful service that hosts the latest 2 hours of data until it can be compressed and uploaded to object storage as a single block. At a certain scale, you simply need to choose which is more important: being able to answer queries or being able to insert more timeseries. If you don't prioritize insertion, the backlog just gets bigger and bigger, and in the eventual case (Murphy's Law guarantees it) of a sudden flood of metrics to ingest, that means ingestion delays of several hours during which you are blind. And if you do prioritize insertion, well, the component simply won't respond to queries, which makes you blind anyway. Lose-lose.
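To make the split query path concrete, here is a toy sketch of it, not how any real system implements it. `ToyStore`, `cut_block`, and the in-memory lists are all made up for illustration; the only thing taken from the description above is the 2-hour window and the idea that uploaded blocks are never appended to.

```python
from dataclasses import dataclass, field

BLOCK_SECONDS = 2 * 60 * 60  # the 2-hour window mentioned above


@dataclass
class ToyStore:
    """Hypothetical model: immutable uploaded blocks plus a mutable head."""
    blocks: list = field(default_factory=list)  # tuples of (ts, value), never modified
    head: list = field(default_factory=list)    # recent samples, still appendable

    def append(self, ts, value):
        # Only the in-memory head accepts appends.
        self.head.append((ts, value))

    def cut_block(self, now):
        """Move samples older than the window into one immutable block."""
        cutoff = now - BLOCK_SECONDS
        old = tuple(s for s in self.head if s[0] < cutoff)
        if old:
            self.blocks.append(old)  # stands in for a single object-storage upload
            self.head = [s for s in self.head if s[0] >= cutoff]

    def query(self, start, end):
        """Answer a range query by merging historical blocks with the head."""
        hist = [s for b in self.blocks for s in b if start <= s[0] <= end]
        recent = [s for s in self.head if start <= s[0] <= end]
        return sorted(hist + recent)
```

The point of the sketch is that every query touching recent data has to go through the same stateful component that is also taking the writes.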
Mimir built in Kafka because it's quite literally necessary at scale. You need the stateful query component (with the latest 2 hours) to prioritize queries, then pull from the Kafka topic on a lower-priority thread when there's spare time to do so. Kafka soaks up the sudden ingestion floods so that they don't result in the stateful query component getting DoS'd.
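A rough sketch of that scheduling idea, under loud assumptions: this is not Mimir's code, the `deque` merely stands in for a Kafka topic, and `ToyIngester`, `tick`, and `batch_size` are hypothetical names. It only shows queries being served first and the buffer being drained when nothing is waiting.

```python
from collections import deque


class ToyIngester:
    """Hypothetical stateful component that serves queries before ingesting."""

    def __init__(self, batch_size=2):
        self.buffer = deque()   # stands in for a Kafka topic absorbing floods
        self.queries = deque()  # pending queries
        self.samples = []       # data already ingested and queryable
        self.answers = []       # (query name, samples visible when answered)
        self.batch_size = batch_size

    def produce(self, sample):
        # Writers only ever touch the buffer, never the ingester directly.
        self.buffer.append(sample)

    def submit_query(self, name):
        self.queries.append(name)

    def tick(self):
        """One scheduling step: queries first, ingestion only with spare time."""
        if self.queries:
            name = self.queries.popleft()
            self.answers.append((name, len(self.samples)))
            return "query"
        if self.buffer:
            for _ in range(min(self.batch_size, len(self.buffer))):
                self.samples.append(self.buffer.popleft())
            return "ingest"
        return "idle"
```

A flood of `produce` calls only makes the buffer longer; queries keep getting answered, just against slightly stale data until the ingester catches up.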
I took a quick look at VictoriaMetrics - no Kafka or Kafka-like component to soak up ingestion floods? DOA.
Again, most companies are not BigCos. If you're a startup/scaleup with one VP supervising several development teams, you likely don't need that scale and VictoriaMetrics is probably just fine; you're not the first person I've heard recommend it. But I would say 80% of companies are small enough to be served by a simple Prometheus or a Thanos Query over HA Prometheus setup, 17% will get a lot of value out of VictoriaMetrics, and the last 3% really need Mimir's scalability.