|
|
|
|
|
by kiitos
552 days ago
|
|
Metrics can be sampled, because metrics by definition represent some statistically summarize-able set of observations of related events, whose properties are preserved even when those observations are aggregated over time. That aggregation doesn't result in loss of information. At worst, it results in loss of granularity. That same key property is not true for logs. Logs cannot be aggregated without loss of information. By definition. This isn't up for debate. You can collect logs into groups, based on similar properties, and you can decide that some of those groups are more or less important, based on some set of heuristics or decisions you can make in your system or platform or whatever. And you can decide that some of those groups can be dropped (sampled) according to some set of rules defined somewhere. But any and all of those decisions result in loss of information. You can say that that lost information isn't important or relevant, according to your own guidelines, and that's fine, but you're still losing information. |
|
A slightly more complicated rules is: drop your logs with increasing probability as they age.
With any rule, you will lose information. Yes. Obviously. Sampling metrics also loses information.
The question is what's the best (or a good enough) trade-off for your use case.