by leni536 4064 days ago
I don't have experience with binary logs. I think the fragility of binary logs is not baseless though. AFAIK there was (is?) a problem in systemd's journal where a local corruption of the log could cause a global unavailability of the logged data.

People like text logs because local corruptions remain local. Some lines could be gibberish, but that's all. I'm not suggesting that this couldn't be done with binary logs, but you have to carefully design your binary logging format to keep this property.
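To make the "carefully design your format" point concrete, here is a minimal sketch of one way a binary log could keep corruption local: each record carries a sync marker, a length, and a CRC32, and the reader skips ahead to the next marker whenever a record fails its check. The marker bytes, header layout, and function names are made up for illustration; this is not how systemd's journal (or any particular tool) lays out its files.

```python
import struct
import zlib

MAGIC = b"\xf0LOG"               # hypothetical sync marker, chosen arbitrarily
HEADER = struct.Struct(">4sII")  # marker, payload length, CRC32 of payload


def encode_record(payload: bytes) -> bytes:
    """Frame one log entry as marker + length + checksum + payload."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def read_records(data: bytes):
    """Yield every payload that passes its checksum, skipping damaged spans."""
    pos = 0
    while True:
        pos = data.find(MAGIC, pos)
        if pos < 0 or pos + HEADER.size > len(data):
            return  # no further complete header in the data
        _, length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) == length and zlib.crc32(payload) == crc:
            yield payload
            pos = start + length
        else:
            pos += 1  # damaged record: resynchronise at the next marker
```

With this framing, a flipped byte costs you the one record it lands in, much like a gibberish line in a text log. The trade-off is twelve bytes of overhead per record, and a payload that happens to contain a valid-looking frame could confuse the resync step, so a real format would need more care than this.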

Otherwise I agree with the author that we shouldn't be afraid of binary formats in general. We need much more general formats and tools, though (equivalents of grep and less).

I'm not fond of "human readable" tree formats like XML or JSON either. bencode could be just as "human readable" as UTF-8 text if one had a less equivalent for bencode.
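As a rough idea of what such a viewer would involve, here is a small sketch of a bencode decoder plus an indented renderer. The function names `decode` and `render` and the output layout are my own invention, not an existing tool, and it assumes well-formed input with byte-string dictionary keys, as bencode requires.

```python
def decode(data: bytes, pos: int = 0):
    """Decode one bencoded value at pos; return (value, next_position)."""
    c = data[pos:pos + 1]
    if c == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if c == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if c == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = decode(data, pos)
            result[key], pos = decode(data, pos)
        return result, pos + 1
    if c.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        start = colon + 1
        length = int(data[pos:colon])
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {pos}")


def render(value, indent: int = 0) -> str:
    """Render a decoded value as indented text, one scalar per line."""
    pad = "  " * indent
    if isinstance(value, bytes):
        return pad + value.decode("utf-8", "replace")
    if isinstance(value, int):
        return pad + str(value)
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            lines.append(pad + key.decode("utf-8", "replace") + ":")
            lines.append(render(item, indent + 1))
    else:  # list
        for item in value:
            lines.append(pad + "-")
            lines.append(render(item, indent + 1))
    return "\n".join(lines)
```

Something like `print(render(decode(raw)[0]))` piped into a pager would already get you most of the way to "less for bencode".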

2 comments

> I don't have experience with binary logs. I think the fragility of binary logs is not baseless though. AFAIK there was (is?) a problem in systemd's journal where a local corruption of the log could cause a global unavailability of the logged data.

From my experience (I do not want to troll or presume you have not tried it), systemd picks up where it left off when an old log is corrupted and starts a new one. There is a command line utility to verify the integrity of these files (I am on my Windows laptop at work, so I cannot check). Now, I am not sure about the state of log file repair. I was told it is not possible. However, it seems this means the file is corrupted in a way that makes it hard to index; it is likely still readable. I wish I had seen this last time.

https://www.reddit.com/r/linux/comments/1y6q0l/systemds_bina...

Granted, I use Arch Linux on an old laptop. These corruptions happened routinely when I had disabled ACPI controls (I do not use the fancy WMs, I am back to Ratpoison) and completely, and I mean completely, drained the battery until the machine came crashing to a halt. So I am not surprised about these corruptions.

Anyone using systemd boxes in production who can comment on this? Flamewar or not, I would like to know more. I do not really care for it one way or the other. Parts I like, parts I do not.

I was thinking exactly the same: once you want to create an efficient binary format that you can query, you have the same problems as a database. And if there is something we have learned in the history of computing, it's that databases are hard to design properly, especially from scratch, and especially when you want them to be immune to random failures without data loss.

The last few entries of a log file before something catastrophic happens are precisely the ones it is most important not to lose.
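For what it's worth, the usual way to protect those last entries is to push each one to disk before carrying on. A minimal sketch, assuming a plain append-only text log; `append_durably` is a hypothetical helper name, and how much `fsync` actually guarantees depends on the OS, filesystem, and hardware:

```python
import os


def append_durably(path: str, line: str) -> None:
    """Append one entry and ask the OS to flush it to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line.rstrip("\n") + "\n")
        f.flush()              # move Python's buffer into the OS
        os.fsync(f.fileno())   # ask the OS to write it out to the device
```

The cost is a sync per entry, which is slow under heavy logging, so loggers often batch writes and accept losing the tail, which is exactly the part you wanted.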