by ricardobeat 844 days ago
Exactly. When they say

> Unlike with prometheus, however, with Wide Events approach we don’t need to worry about cardinality

This is hinting at the hidden reason why not everyone does it. You have to 'worry' about cardinality because Prometheus is pre-aggregating data so you can visualize it fast, and optimizing storage. If you want the same speed on a massive PB-scale data lake, with an infinite amount of unstructured data, and in the cloud instead of your own datacenters, it's gonna cost you a lot, and for most companies it is not a sensible expense.
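To make the cardinality worry concrete, here is a rough sketch of how the number of series a pre-aggregating store has to track grows as the product of the distinct values per label. The label names and counts are hypothetical, picked only for illustration; real systems only store combinations that actually occur, so this is an upper bound.

```python
from math import prod

# Hypothetical label cardinalities for a single metric (illustrative only).
labels = {"method": 5, "status": 8, "endpoint": 200, "pod": 50}


def series_count(cardinalities):
    """Upper bound on distinct series: product of distinct values per label."""
    return prod(cardinalities.values())


base = series_count(labels)
# Adding one unbounded label (an assumed 100k user ids) multiplies everything.
with_user = series_count({**labels, "user_id": 100_000})

print(base)       # 400000
print(with_user)  # 40000000000
```

A single high-cardinality label such as a user id is enough to turn a manageable metric into billions of potential series, which is why it has to be worried about up front.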

It does work at smaller scale though, we once had an in-house system like this that worked well. Eventually user events were moved to MixPanel, and everything else to Datadog, metrics/logs/traces + a migration to OpenTel. It took months and added 2-digit monthly bills, and in the end debugging or resolving incidents wasn't much improved over having instant access to events and business metrics. Whoever figures out a system that can do "wide events" in a cost-effective way from startup to unicorn scale will absolutely make a killing.


https://victoriametrics.com/ would definitely recommend anyone having performance issues with Prometheus to give VictoriaMetrics a try.
Once you plug long term storage onto your Prometheus, do you really need the main Prometheus instances anymore?

Here’s an article about this idea: https://datadrivendrivel.com/posts/rmrfprometheus/

You can substitute the Grafana Agents for OTEL collectors as well.

While Grafana Agent uses fewer resources than Prometheus, a more optimized Prometheus-compatible scraper and router exists: vmagent [1]. I'd recommend giving Grafana Agent and vmagent the same workload and comparing their resource usage.

P.S. Prometheus itself can also act as a lightweight agent, which collects metrics and forwards them to the configured remote storage [2].

[1] https://docs.victoriametrics.com/vmagent/

[2] https://prometheus.io/blog/2021/11/16/agent/

That's just kicking the can over to an object storage API instead of managing disks.
And comes with all the downsides of Prometheus as well
Could you please elaborate on "comes with all the downsides of Prometheus"?
Or, you can use Thanos, the de-facto standard with the biggest OSS community.
The Thanos/Mimir community doesn't help with the configuration overhead, or with the even bigger resource consumption of a huge setup.
Bigger resource consumption of what exactly? Leaf Prometheus instances or the Thanos/Mimir stack compared to VictoriaMetrics? Have you seen a large scale migration between the two, with actual numbers?
A few interesting real-world large-scale migrations are highlighted at https://docs.victoriametrics.com/casestudies/
Not everything emits wide events. Maybe you can get the entire application layer like that, but there is also value in logs and metrics emitted from the rest of the infra stack.

To be fair, you could probably store and represent everything as wide events and build visualization tools out of that that can combine everything together, even if they are sourced from something else.

Wide events seem to be "structured logs with focused schemas" (maybe also published in a special way beyond writing to stdout) but most places I've worked would call that "logging" not "wide events".

The reasons we don't use them for everything are as others in the thread say: it's expensive. Metrics (just the numbers, nothing else) can be compressed and aggregated extremely efficiently, hence cheaply. Logs are more expensive due to their arbitrary contents.

It's all due to expense really.

Columnar storage stores data very efficiently, too - because it compresses data of a similar nature (columns). Check e.g. ClickHouse on this matter: https://clickhouse.com/docs/en/about-us/distinctive-features, https://clickhouse.com/blog/working-with-time-series-data-an...

So I wouldn't say that events are "expensive" while metrics are "cheap" - both depend on the actual implementation, and events can be cheap too.

And so of course if you have to optimise things, you would need to drop some information you pass to the events, but you would need to do the same for metrics (reduce the number of metrics emitted, reduce the prometheus labels,...).

Only if you have small, pre-defined sets of events in data structures that compress well. That is not the case for any real system.

> And so of course if you have to optimise things, you would need to drop some information you pass to the events, but you would need to do the same for metrics (reduce the number of metrics emitted, reduce the prometheus labels,...).

Those are entirely different orders of magnitude, both in size and in how much usefulness you lose. In modern storage backends like VictoriaMetrics, a counter is going to cost you around a byte per metric per probe. And as you emit them periodically, that is essentially independent of incoming traffic.

Capturing the requests into event/trace/whatever other name they gave to logs this month is many times that and is multiplied by traffic.
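A back-of-the-envelope sketch of that argument. The roughly one byte per sample comes from the comment above; the 15s scrape interval, the 1,000 series, the 1 KB uncompressed event size and the request rates are all assumptions made up for illustration, not measurements.

```python
SECONDS_PER_DAY = 86_400


def metric_bytes_per_day(series, scrape_interval_s=15, bytes_per_sample=1):
    """One sample per series per scrape, regardless of traffic."""
    return series * (SECONDS_PER_DAY // scrape_interval_s) * bytes_per_sample


def event_bytes_per_day(requests_per_s, bytes_per_event=1_000):
    """One event per request, so volume scales linearly with traffic."""
    return requests_per_s * SECONDS_PER_DAY * bytes_per_event


metrics = metric_bytes_per_day(series=1_000)  # stays fixed as traffic grows
for rps in (10, 1_000):
    print(f"{rps} req/s: metrics={metrics} B/day, "
          f"events={event_bytes_per_day(rps)} B/day")
```

Under these assumptions the metric volume stays at about 5.8 MB/day while the raw event volume goes from under 1 GB/day to tens of GB/day. Compression and sampling would shrink the event side, but not its linear dependence on traffic.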

> Those are entirely different orders of magnitude, both in size and in how much usefulness you lose. In modern storage backends like VictoriaMetrics, a counter is going to cost you around a byte per metric per probe. And as you emit them periodically, that is essentially independent of incoming traffic.

I thought this argument was about whether wide events can be used for metrics, or whether metrics are a completely different concept. If we want to emulate metrics in events, we would also emit them periodically, independent of traffic, much like Prometheus scraping works.

Storing telemetry efficiently is only part of what monitoring is supposed to do. The other part is querying: ad-hoc queries, dashboards, alerting queries executed every 15s or so. For querying to work fast, there has to be an efficient index, or multiple indexes depending on the query. Since you referred to ClickHouse as efficient columnar storage, please see what makes it different from a time series database - https://altinity.com/wp-content/uploads/2021/11/How-ClickHou...
And yet people use ClickHouse quite effectively for this very problem, see the comment here: https://news.ycombinator.com/item?id=39549218

There are also time-series databases out there that are OK with high cardinality: https://questdb.io/blog/2021/06/16/high-cardinality-time-ser...

> And yet people use ClickHouse quite effectively for this very problem

There is no doubt that ClickHouse is a super-fast database. No one stops you from using it for this very problem. My point is that specialized time series databases will outperform ClickHouse.

> There are also time-series databases out there that are OK with high cardinality

So does this blog say that tolerance to cardinality means that QuestDB indexes only one of the columns in the data generated by this benchmark?

TSDBs like Prometheus, VictoriaMetrics or InfluxDB will perform filtering by any of the labels with equal speed, because this is how their index works. Their users don't need to think about the schema or about which column should be present in the filter.

But in ClickHouse and, apparently, in QuestDB, you need to specify a column or list of columns for indexing (the fewer columns, the better). If the user's query doesn't contain the indexed column in the filter - the query performance will be poor (full scan).
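A toy model of the difference being described, assuming a tiny in-memory set of series with made-up labels. It is not how any of the databases named here is actually implemented; it only shows why an inverted index over every label answers any filter from the index, while a single indexed column falls back to a scan for filters that skip it.

```python
from collections import defaultdict

# Hypothetical series, keyed by id, each with a small label set.
series = {
    1: {"job": "api", "region": "eu", "status": "200"},
    2: {"job": "api", "region": "us", "status": "500"},
    3: {"job": "db", "region": "eu", "status": "200"},
}

# Inverted index: every (label, value) pair maps to the ids carrying it.
inverted = defaultdict(set)
for sid, labels in series.items():
    for pair in labels.items():
        inverted[pair].add(sid)


def query_inverted(**filters):
    """Intersect posting sets; no label is privileged over another."""
    postings = [inverted.get(pair, set()) for pair in filters.items()]
    return set.intersection(*postings) if postings else set(series)


# Single-column index: only lookups on `job` avoid looking at every series.
by_job = defaultdict(set)
for sid, labels in series.items():
    by_job[labels["job"]].add(sid)


def query_single_index(**filters):
    """Use the `job` index when the filter has it, otherwise scan everything."""
    if "job" in filters:
        candidates = by_job.get(filters["job"], set())
    else:
        candidates = set(series)  # full scan
    return {sid for sid in candidates
            if all(series[sid].get(k) == v for k, v in filters.items())}


print(query_inverted(region="eu"))      # answered from the index
print(query_single_index(region="eu"))  # same result, via a full scan
```

Both functions return the same ids; the difference is only in how many series have to be examined when the filter does not mention the indexed column.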

See how this played out in another benchmarketing blog post from QuestDB: https://telegra.ph/No-QuestDB-is-not-Faster-than-ClickHouse-...

The whole point of wide events is recording an arbitrary set of key value pairs. How do you propose storing that in a columnar datastore?
I can't speak for others, but at Honeycomb that's what we do. There are some details in this blog post that might be interesting: https://www.honeycomb.io/blog/why-observability-requires-dis...
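One way to picture arbitrary key/value events in a columnar layout, as a sketch only and not a description of Honeycomb's actual storage engine: keep one column per key ever seen, and store a null wherever an event lacks that key. The `to_columns` helper and the sample events are hypothetical.

```python
def to_columns(events):
    """Pivot a list of dict events into {key: [value or None per event]}."""
    columns = {}
    for i, event in enumerate(events):
        for key, value in event.items():
            # A key seen for the first time gets a column backfilled with None.
            columns.setdefault(key, [None] * i).append(value)
        for col in columns.values():
            if len(col) == i:  # this event did not carry the key
                col.append(None)
    return columns


events = [
    {"service": "api", "duration_ms": 12},
    {"service": "api", "duration_ms": 40, "user_id": "u1"},
    {"service": "db", "rows": 3},
]

cols = to_columns(events)
for name, values in cols.items():
    print(name, values)
```

Each column holds values of one kind, which is what lets a columnar store compress them well, and sparse columns are mostly nulls, which are cheap to encode.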