by zulu-inuoe 1261 days ago
Thank you! Now when you say json logging, are there any common patterns you can guide me to look at? Just looking around for JSON logging gives me results like [1] which talk about JSONL (object per line, presumably JSON serializer makes sure to not emit literal newlines).

But some other places describe this a bit more liberally. And [2] notes that I should include a timestamp in each log entry, which makes sense.

[1]: https://stackoverflow.com/questions/10699953/format-for-writ...
[2]: https://www.papertrail.com/solution/tips/8-essential-tips-fo...

2 comments

In general, if you use tools like Filebeat or Promtail to ingest logs into centralized log search systems (Elasticsearch and Loki, respectively), they prefer one object per line.

It makes parsing and keeping track of the "state" of the file a lot easier. Say that your application crashes or gets killed halfway through writing a log message (a JSON dict), then gets restarted and appends to the log file. How should the log reader handle that case if the result suddenly becomes a valid nested object? And even if it doesn't, should it throw away the first new log message as well, because that was embedded in the invalid JSON object? Much easier to just say "one line is one JSON object; a literal newline is the delimiter that starts a new parse".
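To make the crash scenario concrete, here is a minimal sketch of a line-oriented reader. The function name `parse_json_lines` and the sample log are made up for illustration, and real forwarders do considerably more (offset tracking, rotation handling). It only shows how the newline bounds the damage from a half-written entry.

```python
import json

def parse_json_lines(text):
    """Yield one dict per line that holds a complete JSON object; skip the rest."""
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A write cut off by a crash lands here. Parsing restarts
            # cleanly at the next newline.
            continue
        if isinstance(record, dict):
            yield record

# Hypothetical log file: the second entry was cut off mid-write, and the
# restarted process appended its first entry to that same broken line.
sample = (
    '{"msg": "started"}\n'
    '{"msg": "half-writ{"msg": "restarted"}\n'
    '{"msg": "ready"}\n'
)

for record in parse_json_lines(sample):
    print(record["msg"])
```

Note that the first entry written after the restart is lost together with the truncated one, since both share a line, but everything from the next newline onward parses normally.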

And yes, in any case it's good to have a timestamp on your log message no matter the format, unless you're logging somewhere you know it gets added immediately (like the systemd journal). Your log parser/forwarder can add a timestamp for when it reads your log message, but that is not necessarily the same as when your application emitted it.
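As a rough sketch of stamping the entry at emit time, assuming Python's standard `json` and `datetime` modules: the helper name `format_entry` and the `ts` key are arbitrary choices for this example, not a convention any particular tool requires.

```python
import json
from datetime import datetime, timezone

def format_entry(msg, **fields):
    """Return one log entry as a single-line JSON string, stamped at emit time."""
    entry = {
        # Taken when the application emits the entry, not when a forwarder reads it.
        "ts": datetime.now(timezone.utc).isoformat(),
        "msg": msg,
        **fields,
    }
    # json.dumps escapes newlines inside strings, so the result stays on one line.
    return json.dumps(entry)

print(format_entry("Request finished", status=200, user="brandur@mutelight.org"))
```

Using UTC with an explicit offset avoids ambiguity when logs from several hosts end up in the same search system.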

Another benefit with json/logfmt that bears mentioning explicitly: it has structure.

This means that you shouldn't just write (to reuse the previous example):

    msg="Request for brandur@mutelight.org finished with status 200"

you should do it like

    msg="Request finished" status=200 user=brandur@mutelight.org

and not put any variables into the msg key (or do any advanced formatting for any of the keys, for that matter). This way, once the logs are in a log system that understands your format, you can run searches like "all log messages where user=foo" or "all statuses that are >=500 and <600", or search on specific messages, all without having to craft elaborate regular expressions. It also performs better, since the log search system can index fields and apply various optimizations so that it doesn't have to do a full-text search every time.
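
A toy illustration of why separate fields help, using JSON lines and plain Python in place of a real log search system. The sample entries and the user `foo` are invented for the example, and a real system would use indexes rather than a linear scan.

```python
import json

# Hypothetical entries with a constant msg and the variables in their own keys.
lines = [
    '{"msg": "Request finished", "status": 200, "user": "brandur@mutelight.org"}',
    '{"msg": "Request finished", "status": 503, "user": "foo"}',
    '{"msg": "Request finished", "status": 404, "user": "foo"}',
]
records = [json.loads(line) for line in lines]

# "all statuses that are >=500 and <600": a numeric comparison, no regex needed.
server_errors = [r for r in records if 500 <= r.get("status", 0) < 600]

# "all log messages where user=foo": an exact match on one field.
by_user = [r for r in records if r.get("user") == "foo"]

print(len(server_errors), len(by_user))
```

With the status embedded in the message text instead, both queries would need pattern matching against every message.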