by chipdart 734 days ago
> This is my main gripe too. I don't understand why {traces, logs, metrics} are not just different abstractions built on top of "events" (blobs of data your application ships off to some set of central locations).

By design, they cannot be abstractions of a single concept. For example, logs have a hard requirement on preserving sequential order and session, and on emitting strings, whereas metrics are aggregated, sampled, and dropped arbitrarily, and consist of single discrete values. Logs can store open-ended data, and thus need to comply with tighter data protection regulations. Traces often track a very specific set of generic events, whereas there are whole classes of metrics that serve entirely different purposes.

Just because you can squint hard enough to only see events being emitted, that does not mean all event types can or should be treated the same.


> Just because you can squint hard enough to only see events being emitted

If you squint hard enough you can fool yourself into thinking all metrics have the same availability requirements. It’s not the case. There are plenty of time series data metrics where arbitrarily dropping them or aggregating them would throw off your alerting entirely.
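To make the alerting concern concrete, here is a minimal sketch. It assumes a hypothetical error counter shipped as delta events (errors seen since the previous event) and an alert threshold invented for illustration; none of these names or numbers come from a real system.

```python
# Hypothetical data: errors observed in each 10-second window of one minute.
deltas = [3, 4, 2, 5, 3, 4]
ALERT_THRESHOLD = 20  # assumed rule: alert when errors per minute >= 20


def errors_per_minute(events):
    """Sum the delta events that actually arrived at the backend."""
    return sum(events)


full = errors_per_minute(deltas)        # every event delivered
lossy = errors_per_minute(deltas[::2])  # every other event dropped in transit

fires_full = full >= ALERT_THRESHOLD    # alert fires on complete data
fires_lossy = lossy >= ALERT_THRESHOLD  # alert stays silent on lossy data
```

With delta-style events, every dropped event silently lowers the total, so the same incident can fall below the threshold and never page anyone.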

Indeed one would have to squint to the point of blindness.

Logs are single points in time in a flat, linear sequence, and are never dropped (at most you'd collapse sequences of identical, repeated logs). Think dmesg, syslog, systemd journald/journalctl.

Metrics are statistical numeric data, which can be series, averages, histograms, buckets... Aggregation/reduction can be done on the fly, before the data leaves the observed thing. Some can be dropped, but it is important that whatever remains after dropping stays statistically meaningful.
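As an illustration of aggregating before anything leaves the observed process, here is a minimal sketch of a fixed-bucket histogram. The `Histogram` class, its bucket bounds, and the sample latencies are all hypothetical and not taken from any particular metrics library.

```python
from bisect import bisect_left


class Histogram:
    """Fixed-bucket histogram: keeps counts, not individual observations."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One counter per bound, plus a final overflow bucket.
        self.counts = [0] * (len(self.upper_bounds) + 1)
        self.total = 0
        self.sum = 0.0

    def observe(self, value):
        # bisect_left returns the first bound >= value,
        # i.e. the bucket meaning "value <= bound".
        self.counts[bisect_left(self.upper_bounds, value)] += 1
        self.total += 1
        self.sum += value

    def snapshot(self):
        """The only thing that would be shipped off the host."""
        return {
            "bounds": list(self.upper_bounds),
            "counts": list(self.counts),
            "count": self.total,
            "sum": self.sum,
        }


latency_ms = Histogram([10, 50, 100, 500])  # assumed bucket bounds
for v in [3, 12, 48, 50, 75, 120, 900]:     # made-up observations
    latency_ms.observe(v)
```

Seven observations collapse into five counters plus a count and a sum, which is the kind of reduction that makes no sense for logs.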

Spans are a duration in time representing some operation, with metadata (numeric, stringy, structured even) attached pertaining to that operation. Spans have a parent, forming a tree, which forms a trace. Spans can be deduped and/or sampled, with specific occurrences forcefully kept (e.g. 500 error) or dropped (e.g. healthcheck).
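A minimal sketch of that span shape and sampling rule follows. The `Span` fields, the `status` and `route` attribute keys, the `/healthz` path, and the 10% default rate are all assumptions made for illustration, not any tracing library's API.

```python
import zlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root of the trace tree
    name: str
    start_ms: int
    end_ms: int
    attrs: dict = field(default_factory=dict)  # free-form metadata

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def keep_trace(spans, sample_percent=10):
    """Decide for a whole trace once all of its spans are known."""
    # Forced keep: any span that recorded a server error.
    if any(s.attrs.get("status", 0) >= 500 for s in spans):
        return True
    root = next(s for s in spans if s.parent_id is None)
    # Forced drop: healthcheck traffic (hypothetical route name).
    if root.attrs.get("route") == "/healthz":
        return False
    # Otherwise sample deterministically: same trace_id, same decision.
    return zlib.crc32(root.trace_id.encode()) % 100 < sample_percent
```

Hashing the trace id rather than rolling a random number means every collector that sees the trace makes the same keep-or-drop decision.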

They are fundamentally different (technical) primitives, a.k.a. (functional) tools, to observe different things and serve different goals.

Right, the point I’m making is that logs, metrics, and traces are views of data, with a pretty hazy relationship to the shape of the data itself or the handling requirements. Any assumption you make about them as a category (logs are unstructured, traces are sampled, metrics can be aggregated) is wrong nearly as much as it’s right.

> Right, the point I’m making is logs, metrics, traces, these concepts are views of data (...)

Not really. Logs are fundamentally different than operational metrics, which are fundamentally different than business/behavioral metrics, which are fundamentally different than traces, etc etc etc.

This is not a matter of "view". It's the result of completely different system requirements. They are emitted differently, they are processed/aggregated differently, they are stored differently, they are consumed differently.

Even within business metrics types, which is already a specialized type of metrics, you have fundamentally different system requirements. Click stream metrics mix traits of tracing with logging and metrics, and have very specific requirements regarding data protection.

They are all distinct observability features. They are not the same. At all. This is not up for debate.

> Click stream metrics mix traits of tracing with logging and metrics

This sounds like you are admitting my point? My point is not “there is no difference between anything”; my point is that the 3 buckets of “metrics, logs, traces” are neither all-encompassing in terms of the types of telemetry one might feasibly want to emit, nor are they mutually exclusive. Here is perhaps a better writeup of what I mean:

https://open.substack.com/pub/isburmistrov/p/all-you-need-is...

> If you squint hard enough you can fool yourself into thinking all metrics have the same availability requirements.

I'm sorry, I have no idea what point you tried to make.