by 10987654321 1261 days ago
Not sure about lnav, but most log aggregation systems support JSON- and logfmt-formatted logs, and many standard logging libraries can emit those formats.

Of those, JSON is better if you want to do more advanced stuff (nest dictionaries, use lists, ...), and logfmt is better if you also want logs that are human-readable without external tools. An example line can look like:

    msg="Request finished" tag=request_finish status=200 user=brandur@mutelight.org user_id=1234 app=mutelight app_id=1234
Some more info here https://www.brandur.org/logfmt
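
To make the format concrete, here is a minimal sketch of a logfmt encoder in Python. The function name `logfmt_encode` is made up for illustration, and the quoting rule (quote values that are empty or contain a space, `"`, `=`, or a newline) is an assumption rather than a spec, since implementations differ on this:

```python
def logfmt_encode(fields):
    """Render a dict as one logfmt line: key=value pairs separated by spaces."""
    parts = []
    for key, value in fields.items():
        text = str(value)
        # Assumed quoting rule: quote empty values and values containing
        # a space, a double quote, an equals sign, or a newline.
        if text == "" or any(c in text for c in ' "=\n'):
            escaped = (
                text.replace("\\", "\\\\")  # escape backslashes first
                .replace('"', '\\"')        # then double quotes
                .replace("\n", "\\n")       # keep the entry on one line
            )
            text = '"' + escaped + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


line = logfmt_encode(
    {"msg": "Request finished", "status": 200, "user": "brandur@mutelight.org"}
)
print(line)
```

For real use you probably want an existing library rather than this, precisely because the escaping details vary.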
5 comments

Agree with logfmt. I wrote the logfmter python library: https://github.com/jteppinette/python-logfmter. You can quickly have all of your logs (including 3rd party) converted to this style.
Has anybody written an actual spec for logfmt? I noticed that different implementations handle escaping of quoted strings subtly differently
Thank you! Now when you say json logging, are there any common patterns you can guide me to look at? Just looking around for JSON logging gives me results like [1] which talk about JSONL (object per line, presumably JSON serializer makes sure to not emit literal newlines).

But some other places describe this a bit more liberally. And [2] notes that I should include a time stamp in each log entry which makes sense.

[1]: https://stackoverflow.com/questions/10699953/format-for-writ... [2]: https://www.papertrail.com/solution/tips/8-essential-tips-fo...

In general if you use things like filebeat or promtail to do log ingestion into centralized log search systems (elasticsearch or loki in these examples) they prefer one object per line.

It makes parsing and keeping track of the "state" of the file a lot easier. Say that your application crashes/gets killed halfway through writing a log message / JSON dict and then gets restarted and appends to the log file. How should the log reader handle that case if it suddenly becomes a valid nested object? And even if it doesn't, should it throw away the first new log message as well because that was embedded in the invalid JSON object? Much easier to just say "one line is one JSON object, and a literal newline is the delimiter that starts a new parse".
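
A rough sketch of what that rule buys a reader, in Python. The function name `read_json_lines` and the sample lines are hypothetical; the point is only that a half-written entry damages its own line and nothing after the next newline:

```python
import json


def read_json_lines(lines):
    """Parse one JSON object per line; count and skip lines that don't parse."""
    records, skipped = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A truncated or corrupted entry only costs us this one line.
            skipped += 1
            continue
        if isinstance(record, dict):
            records.append(record)
        else:
            skipped += 1
    return records, skipped


# Hypothetical file contents: the second line simulates a crash mid-write,
# with the restarted process appending right after the truncated entry.
sample = [
    '{"msg": "a"}',
    '{"msg": "b", "sta{"msg": "restarted"}',
    '{"msg": "c"}',
]
records, skipped = read_json_lines(sample)
print(records, skipped)
```

Real shippers like filebeat or promtail have their own handling; this is just the idea.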

And yes, in any case it's good to have a timestamp on your log message no matter the format, unless you're logging somewhere that you know adds one immediately (like the systemd journal). Your log parser/forwarder can add a timestamp for when it reads your log message, but that is not necessarily the same as when your application emitted it.
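
As a sketch of both points (one object per line, timestamp set by the application), here is a small formatter for Python's standard `logging` module. The class name `JsonLineFormatter` and the field names `ts`, `level`, and `msg` are my own choices for illustration, not any standard:

```python
import json
import logging
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Format each record as a single-line JSON object with a UTC timestamp."""

    def format(self, record):
        entry = {
            # record.created is the time the application emitted the message.
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "msg": record.getMessage(),
        }
        # json.dumps escapes newlines inside strings, so the output is one line.
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("Request finished")
```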

Another benefit with json/logfmt that bears mentioning explicitly: it has structure.

This means that you shouldn't just write (to reuse the previous example):

    msg="Request for brandur@mutelight.org finished with status 200"
you should do it like

    msg="Request finished" status=200 user=brandur@mutelight.org
and not put any variables into the msg key (and not really do advanced formatting for any of the keys, for that matter). This way, once the logs are in a log system that understands your format, you can do searches like "all log messages where user=foo" or "all statuses that are >=500 and <600", or search on specific messages, all without having to craft elaborate regular expressions. Performance is better too, since the log search system can index the fields and apply various optimizations so that it doesn't have to do a full-text search every time.
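
To illustrate why the fields matter, here is a toy version of those two searches in Python over already-parsed entries. The records and the helper names `by_user` and `server_errors` are invented for the example; a real log system would do this with indexes rather than a linear scan:

```python
# Hypothetical parsed log entries (e.g. from JSON lines or logfmt).
records = [
    {"msg": "Request finished", "status": 200, "user": "brandur@mutelight.org"},
    {"msg": "Request finished", "status": 503, "user": "foo"},
    {"msg": "Request finished", "status": 404, "user": "foo"},
]


def by_user(entries, user):
    """All log messages where user=<user>."""
    return [e for e in entries if e.get("user") == user]


def server_errors(entries):
    """All entries whose status is >=500 and <600."""
    return [e for e in entries if 500 <= e.get("status", 0) < 600]


print(len(by_user(records, "foo")))
print([e["status"] for e in server_errors(records)])
```

Neither query needs a regular expression, which would not be true if the user and status were baked into the msg string.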
lnav does support JSON-lines and logfmt logs. For JSON-lines, it will pretty-print the log messages to make them human readable.

For logfmt, I seem to remember the spec not being very clear on quoting semantics (maybe I'm wrong). Anyhow, I would suggest using JSON since it has pretty broad support at this point.

Hmmm, it doesn't say in https://lnav.org/features#automatic-log-format-detection but i see that at least json (and xml) is mentioned under the pretty-printing header.

You would know what's supported, being the author; I'm just saying that the docs aren't super clear from a quick glance :).

And yes, I second the suggestion to focus on JSON. The main benefit of logfmt is that it's simpler for a human to parse directly, but in general you probably shouldn't aim for that.

> Hmmm, it doesn't say in https://lnav.org/features#automatic-log-format-detection but i see that at least json (and xml) is mentioned under the pretty-printing header.

Yes, I should mention it on the features page. It's currently only mentioned in the main docs:

https://docs.lnav.org/en/latest/formats.html

You can use logfmt with Serilog on dotnet too:

https://github.com/serilog-contrib/Serilog.Logfmt