by AdieuToLogic 331 days ago
> In the end, what's the difference between a log and a metric?

Essentially, a log entry is the emission of state known by an individual code execution path at the point the log entry can be produced, whereas a metric is a measurement of a specific runtime execution performed by the system.

For example, a log entry of:

  module_logger.info(
    f"Processed {num_events_processed_since_last_log} events."
  )

Emits a log entry capturing the processing state known when the statement is evaluated. What it does not do is separate this information (a time-based attribute in this case) from other log entries, such as "malformed event detected" or "database connection failed."

More importantly, putting metrics into log entries forces timing to include log I/O, requires metrics analysis systems to parse all log entries, and limits the metrics that can be reported to those expressible in a message text field.

Maybe most important of all, however, is that metrics collection and reporting is orthogonal to logging. So in the example above, if the log level were set to "error", then there would be no log-based metric emitted.
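
A minimal sketch of that last point, assuming a plain in-process counter stands in for a real metrics client (the `events_processed` counter and `process_batch` function are hypothetical names, not from any library):

```python
import io
import logging

# Hypothetical in-process counter standing in for a real metrics client.
events_processed = 0


def process_batch(logger: logging.Logger, batch: list) -> None:
    """Record a metric and emit a log entry for one batch."""
    global events_processed
    # Metric: updated regardless of the configured log level.
    events_processed += len(batch)
    # Log entry: dropped whenever the logger's level is above INFO.
    logger.info("Processed %d events.", len(batch))


# Capture log output in memory so the effect of the level is visible.
stream = io.StringIO()
module_logger = logging.getLogger("metrics_vs_logs_demo")
module_logger.addHandler(logging.StreamHandler(stream))
module_logger.propagate = False
module_logger.setLevel(logging.ERROR)

process_batch(module_logger, ["a", "b", "c"])
log_output = stream.getvalue()
```

With the level at "error", `log_output` stays empty while the counter still advances, so a metric derived only from the log entry would be lost.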


This is a reasonable first pass answer, but there's more nuance to this...

> What it does not do is separate this information

Logging at scale should really be structured, which means that you can trivially differentiate between different types of log message. You also get more dimensions all represented in that structure.

> limits the type of metrics which can be reported to be those expressible in a message text field

This is another example: ideally, logging shouldn't be text-based. You might have a human-readable summary field, but metrics can easily be attributes on the log message.
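
As a rough sketch of what that can look like with Python's standard `logging` module: the `JsonFormatter` class and the `event_type` / `events_processed` field names are assumptions made up for illustration, not part of any particular logging stack.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            # Human-readable summary field.
            "message": record.getMessage(),
            # Structured attributes passed via `extra=`; absent ones are None.
            "event_type": getattr(record, "event_type", None),
            "events_processed": getattr(record, "events_processed", None),
        }
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("structured_demo")
logger.addHandler(handler)
logger.propagate = False
logger.setLevel(logging.INFO)

# The metric travels as an attribute, not as text inside the message.
logger.info(
    "Processed batch",
    extra={"event_type": "batch_processed", "events_processed": 42},
)

entry = json.loads(stream.getvalue().strip())
```

Here the event type distinguishes this entry from, say, a "malformed event" entry, and the count is a number that needs no text parsing.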

The more I work in this area the more I'm realising that logs and metrics are pretty interchangeable. There are absolutely trade-offs for each, but you can convert logs into metrics easily (Datadog does this), and with a bit more effort you could turn a metric into logs if you wanted to (querying metrics as rows in a SQL database is handy!).
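
To illustrate the logs-to-metrics direction, here is a small sketch that aggregates structured log lines into counters. It is not how Datadog or any other product does it; the JSON field names and the `logs_to_metrics` function are assumed for the example.

```python
import json
from collections import Counter

# Hypothetical structured log lines, one JSON object per line.
log_lines = [
    '{"event_type": "batch_processed", "events_processed": 40}',
    '{"event_type": "malformed_event"}',
    '{"event_type": "batch_processed", "events_processed": 2}',
]


def logs_to_metrics(lines):
    """Derive a per-event-type count and a running total from log entries."""
    counts = Counter()
    total_processed = 0
    for line in lines:
        entry = json.loads(line)
        counts[entry["event_type"]] += 1
        # Entries without the attribute contribute nothing to the total.
        total_processed += entry.get("events_processed", 0)
    return counts, total_processed


counts, total_processed = logs_to_metrics(log_lines)
```

The trade-off is visible even at this size: every entry has to be parsed to produce two numbers.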

Metrics collection is also not necessarily orthogonal to logging; it depends on your system. From a server, you might have logs pushed to an external source and metrics pulled from the server by Prometheus, but those are just implementation details. You can also have logs pulled from log files, and metrics pushed to a statsd endpoint.

I've worked on mobile apps where metrics get aggregated locally and then pushed as log events to the server with one log event per metric and dimension set, only for the server to then typically turn them back into metrics.
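
A sketch of that pattern, with everything here assumed for illustration (the `LocalAggregator` class, the event shape, and the metric names are hypothetical, not the actual mobile implementation):

```python
from collections import defaultdict


class LocalAggregator:
    """Sums metric values per (name, dimension set) until flushed."""

    def __init__(self):
        self._sums = defaultdict(float)

    def record(self, name, value, **dimensions):
        # Sort the dimensions so the same set always maps to the same key.
        key = (name, tuple(sorted(dimensions.items())))
        self._sums[key] += value

    def flush(self):
        """Return one log event per metric and dimension set, then reset."""
        events = [
            {"metric": name, "dimensions": dict(dims), "sum": total}
            for (name, dims), total in self._sums.items()
        ]
        self._sums.clear()
        return events


aggregator = LocalAggregator()
aggregator.record("screen_view", 1, screen="home")
aggregator.record("screen_view", 1, screen="home")
aggregator.record("screen_view", 1, screen="settings")

# These events would be sent through the logging pipeline to the server.
events = aggregator.flush()
```

The server can then group the events by `metric` and `dimensions` to rebuild the metric.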

It's good to understand the tradeoffs, the technology, whether you're using push or pull, where data is spooled or aggregated, data costs, etc. But this stuff is all pretty malleable and there's often no clearly right answer.

I think what you're saying is that you can make a logging system LARP metrics. In the end, logging goes to fd 1 and 2 and metrics usually go over HTTP, but ofc you can dump "metrics" into stdout; it's just not as practical given what the tools are built for.

In my local fun projects that run on my machine I might dump metrics into the logs because it's practical, but it doesn't make it "right".

I log over RPC and send metrics over RPC. Thinking about logs as going to a file descriptor and metrics as going over HTTP focuses too much on a particular implementation and not enough on the concepts.

Also, it's not about logs role-playing as metrics; I'm saying you can literally turn one into the other, in either direction, and there are valid use cases for that.

You shouldn’t use f-strings with logging.
I know and understand the reasons for that rule, but it’s one of the first ones I disable in linters. The theoretical benefits in the context of the systems I work on aren’t worth the extra friction.
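
For context, the usual reason given for the rule is that an f-string is formatted before the logger decides whether to emit, while %-style arguments are only formatted if the entry is actually produced. A small sketch, where the `Expensive` class is a hypothetical stand-in for a value that is costly to render:

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive"


logger = logging.getLogger("fstring_demo")
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.ERROR)  # INFO entries are disabled

eager = Expensive()
# The f-string is built before info() runs, so __str__ is always called.
logger.info(f"value={eager}")

lazy = Expensive()
# The argument is only formatted if the entry is emitted, which it is not here.
logger.info("value=%s", lazy)
```

Whether that saved formatting work matters depends on the system, which is the trade-off the reply above is weighing.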