by jeady 4063 days ago
I think the author is conflating several problems here. There are several ways logs can be used, and efficiency is a scale. For example, if I receive a bug report, I like to be able to locate the textual logs from when the incident occurred and actually just sit and read what was happening at the time. On the other hand, if I'm doing higher-level analysis, such as which features users use most, clearly it's more efficient to have some sort of structured format because you're interested in the logs in aggregate. The author makes it sound like they're advocating optimizing for the aggregate use case at the expense of other use cases. I think that the declaration that textual logs are terrible is an oversimplification of the considerations in play.

Also, if the author has a 5-node cluster producing 100Gb of logs a day, the logs may also be too verbose or poorly organized. I work on a system that produces hundreds of Gb of logs a day, but with proper organization they're perfectly manageable.

I think that a more nuanced solution is to log things that are useful to manual examination in text form, but high-frequency events that are not particularly useful could reasonably be logged elsewhere (e.g. a database or binary log that is asynchronously fed into a database).
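To make the split concrete, here is a minimal sketch in Python. It assumes SQLite as the "elsewhere" and a background thread as the asynchronous feed; the `EventSink` class, the `events` table, and the `app` logger name are all hypothetical, not anything from the article.

```python
import logging
import queue
import sqlite3
import threading

# Human-readable text log, meant for reading while debugging.
text_log = logging.getLogger("app")


class EventSink:
    """Hypothetical sink: high-frequency events are queued and a
    background thread inserts them into SQLite asynchronously."""

    def __init__(self, path=":memory:"):
        self._q = queue.Queue()
        # check_same_thread=False because the worker thread writes
        # through a connection opened by the constructing thread.
        self._db = sqlite3.connect(path, check_same_thread=False)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events (ts REAL, name TEXT, value REAL)"
        )
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, ts, name, value):
        # Cheap for the caller: just enqueue, no disk I/O here.
        self._q.put((ts, name, value))

    def _drain(self):
        while True:
            item = self._q.get()
            if item is None:  # sentinel pushed by close()
                break
            self._db.execute("INSERT INTO events VALUES (?, ?, ?)", item)
            self._db.commit()

    def close(self):
        # Flush everything queued so far, then stop the worker.
        self._q.put(None)
        self._worker.join()

    def count(self, name):
        # Aggregate query; only call this after close() in this sketch.
        row = self._db.execute(
            "SELECT COUNT(*) FROM events WHERE name = ?", (name,)
        ).fetchone()
        return row[0]
```

Anything a person would want to read goes through `text_log.warning(...)` and friends as usual; the noisy per-request counters go through `sink.record(...)` and only ever get looked at in aggregate.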

In conclusion, as is frequently the case with engineering, I think the author oversimplifies the problem here and tries to present a one-size-fits-all solution instead of taking a more pragmatic approach. Textual logs are useful when meant for human consumption (debugging) and when they can be organized such that the logs of interest at any time are limited in size, and some other binary-based format is useful for aggregate higher-level analysis.

1 comment

With a binary log storage system, nothing stops you from browsing all logs that happened around the time of the incident. Instead of locating the files, you just tell the engine to show you the logs from that time onwards (or from a little bit before).
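As a rough illustration of "show me the logs from that time onwards", here is a sketch in Python that assumes the logs sit in a SQLite table indexed by timestamp. The `logs` schema and the `open_store` and `logs_around` helpers are made up for the example; a real binary log engine would have its own query interface.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a hypothetical log store: one table, indexed by timestamp."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS logs (ts REAL, host TEXT, msg TEXT)")
    db.execute("CREATE INDEX IF NOT EXISTS logs_ts ON logs (ts)")
    return db


def logs_around(db, incident_ts, before=30.0, after=300.0):
    """Return log rows from `before` seconds ahead of the incident
    up to `after` seconds past it, oldest first."""
    rows = db.execute(
        "SELECT ts, host, msg FROM logs WHERE ts BETWEEN ? AND ? ORDER BY ts",
        (incident_ts - before, incident_ts + after),
    )
    return rows.fetchall()
```

The caller only supplies a time window; which file or node the entries originally came from never enters into it.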

As for our logs being too verbose: nope, read the article.

Also, it's not a one-size-fits-all solution: I have no problem with people using text. All the article wants to show is that binary logs are not evil, bad, useless, etc., and that there are actually very good reasons to use them.

For example, storing logs in a database is one kind of binary log storage: most databases don't store the data as text.