Hacker News new | ask | show | jobs
by corytheboyd 1036 days ago
Structured logging is such an easy to gain, massive improvement to observability. Assuming you can pay for the log processor to make sense of it all :)

I’ve been working on a side project to bring something like the DataDog log explorer to the local development environment. The prototype I made has already been extremely helpful in debugging issues in a very complex async ball of Rails code. Does something like that sound useful to other folks? Does it already exist and I just can’t find it?

3 comments

> Does something like that sound useful to other folks?

Very much so!

One piece I'd like to have is a good output format from my program. Right now I have stuff outputting logs in JSON format to stderr/local file (then picked up by the Opentelemetry OTLP collector, Datadog agent, AWS CloudWatch, whatever) and the traces/spans actually sent from my process to the collector over the network. It baffles me why the traces/spans are done in that way rather than placed in a local file (and for compressibility, ideally the same file as the logs). The local file method would make it easier to have simple local-focused tools, would lower the requirements for outputting these records (no need to have async+http client set up, just the writer thread), and would better handle the collector being in a bad mood at the moment.

That's the most important piece, but if getting that done requires inventing a new format (it seems to?!?), there are some other details I'd like to it to do well. You should be able to see all the stuff associated with each log entry's matching span, which means that spans and attributes should be recorded on open, not just on close. (Attributes can also be updated mid-span; the Rust tracing library allows this.) (OpenTelemetry notably only supports complete spans). It could be more efficient by interning the keys. some care about how rotation is handled, timestamps are handled, etc.

I hadn't even considered collecting traces/spans in this way yet, and have taken the approach of "stuff outputting logs in JSON format to stderr/local file". I usually end up writing a (temporary, structured) log message with the relevant span tags, but wouldn't it be much better to run the actual trace/span code and be able to verify it locally without the ad-hoc log message?

The prototype I built is a web application that creates websocket connections, and if those connections receive messages that are JSON, log lines are added. Columns are built dynamically as log messages arrive, and then you can pick which columns to render in the table. If you're curious here's the code, including a screenshot: https://github.com/corytheboyd-smartsheet/json-log-explorer

With websockets, it's very easy to use websocketd (http://websocketd.com), which will watch input files for new lines, and write them verbatim as websocket messages to listeners (the web app).

To make the idea real, would want to figure out how to not require the user to run websocketd out of band, but watching good ol' files is dead simple, and very easy to add to most code (add a new log sink, use existing log file, etc.)

The Grafana stack, including Loki for logs, can be started easily in a docker-compose project in your local environment. Not that Grafanas log viewer is the best in class but it’s at least something. For terminal fans there is a logcli.

If you are concerned with cost of the log processor, Loki again has your back by being very easy and lightweight to deploy. Giving you same tools in dev as in prod.

Have you checked lnav?
I’ve seen it mentioned before, but I haven’t given it a demo yet. To be honest I wasn’t sure if TUI was the right UI for something like this, as it can’t pull off all of the things a GUI can. It’s not that I have a TUI allergy either (I can’t live without lazygit).
There are certainly limits on what can be displayed and done interactively due to the low resolution. Especially, when it comes to charting data. Some visualizations are available within lnav, mostly bar charts and the like. What type of things are you trying to do that you think are not possible within a TUI?
lnav is neat but doesn't really do structured logging AFAICT, just predetermined fields for the format.
lnav has support for JSON-lines, logfmt, as well as the Bro and W3C Extended Log File formats that are XSV and self-describing. The contents are also accessible through SQLite tables. Is there some gap here that you're thinking of?
The idea of structured logs is that every place in the code where you define a log message, you can throw in extra attributes (key/value) pairs.

As far as I can tell, the lnav feature you describing allows you to define a JSON log format with predefined fields. But if some log message uses an attribute you haven't anticipated in the format definition, there's no way to see it in the pretty-printed output or filter on it in the UI and no ability to see it in the SQLite virtual table. That's why I say lnav doesn't appear to support structured logs.

edit: oh, I missed `"hide-extra": false`! That significantly improves things! Still, I don't see a way to access it from the SQLite virtual table. I also don't see something else I might want: a way to have a different "view" for certain log messages; maybe to switch between filtering/viewing particular ones, maybe to just have line-format be conditional based on the detected format. (I guess I can sort of do this based on `module-field`? but I might want it lighter-weight/finer-grained than that.)

> But if some log message uses an attribute you haven't anticipated in the format definition, there's no way to see it in the pretty-printed output

Properties in the log message that are not in the "line-format" are displayed below the message as key-value pairs. As an example, the bunyan[1] log format[2] has a few standard properties that are used in the "line-format" to form the main message. Then, as shown in this test[3] for this log[4], the remaining properties (even ones not mentioned in the format file) are shown underneath.

> or filter on it in the UI

Since they are part of the message as mentioned above, they can be filtered on.

> and no ability to see it in the SQLite virtual table.

The "log_raw_text" column in the table can be used to access the original log message from the file. So, you can use the JSON functions in SQLite to retrieve the value:

    ;SELECT log_raw_text ->> '$.repository' from bunyan

[1] - https://github.com/trentm/node-bunyan

[2] - https://github.com/tstack/lnav/blob/master/src/formats/bunya...

[3] - https://github.com/tstack/lnav/blob/master/test/expected/tes...

[4] - https://github.com/tstack/lnav/blob/master/test/logfile_buny...

Yeah it's essential to have a viewer that deals natively with structured logs.

I'm iterating on a log pretty printer that accepts structured logs in a pipe and does things like color coding, adding terminal-recognized vscode:// hyperlinks for call stacks, smart wrapping based on the terminal width, and special formatting for panics and stuff.

NCurses is probably coming in a couple months.

Does anything like this already exist?