by the_mitsuhiko 4263 days ago
> The main case against binary logs is their corruptibility. This happens more often than you’d think, due to not having any transaction consistency, as an RDBMS would. The advice of the systemd developers on handling this? Ignore it.

To be honest, if the format is done well there is no reason you cannot still read all the non-corrupted records. Text files get corrupted too; you just generally live with them being broken because you can read the unbroken parts.

Going by the bug tracker the journald approach is to make the reader work with broken files better.
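
As a rough illustration of why a framed binary format can still be read around damage, here is a minimal sketch. The framing (a `MAGIC` marker, a length, and a CRC32 per record) and the names `encode_record` and `read_records` are made up for this example and are not journald's actual on-disk format.

```python
import struct
import zlib

MAGIC = b"\xffREC"  # hypothetical record marker, not journald's real format
HEADER = struct.Struct("<4sII")  # marker, payload length, CRC32 of payload


def encode_record(payload: bytes) -> bytes:
    """Frame one payload as marker + length + checksum + payload."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def read_records(data: bytes):
    """Return (records, damaged_offsets), skipping records that fail checks."""
    records, damaged = [], []
    pos = 0
    while pos < len(data):
        ok = False
        if pos + HEADER.size <= len(data):
            magic, length, crc = HEADER.unpack_from(data, pos)
            end = pos + HEADER.size + length
            if magic == MAGIC and end <= len(data):
                payload = data[pos + HEADER.size:end]
                if zlib.crc32(payload) == crc:
                    records.append(payload)
                    pos = end
                    ok = True
        if not ok:
            # Report where the damage starts, then resynchronise on the
            # next marker found after this position.
            damaged.append(pos)
            nxt = data.find(MAGIC, pos + 1)
            if nxt == -1:
                break
            pos = nxt
    return records, damaged
```

With three records and one flipped byte in the middle one, a reader like this would return the first and third payloads plus the offset of the damaged record. A truncated tail is reported the same way instead of being silently cut off mid-line.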

The binary logs are append only. The bug tracker entry says that they got "some corrupted logs" but not how. People get corrupted or truncated text logs all the time, they just ignore it because "oh the log ends midway through a line? eh".

The ridiculous "oh my god, my logs will always be ruined" thing is a beat-up by those who need to assassinate something about systemd, because it has always been devoid of any technical accuracy or discussion.

A truncated text log can still be informative, and I've never had a corrupted text log hang or crash my text editor.

That's because your text editor has had thousands of hours of work going into catching all of the edge cases.

Things which have caused editors to hang or crash:
1. Binary data outside of the US-ASCII visible range
2. Malformed or creative Unicode combinations
3. Very long lines
4. Very many lines
5. Inconsistent line termination
6. Lines with patterns which trigger syntax highlighting, URL underlining, etc.
7. Embedded terminal escape sequences

It's not unreasonable to imagine that the journald developers might spend the same amount of time on safeguards, something which is easier with a well-defined binary format.

Yeah, so let's just redo all those thousands of hours of work... for one specific format that has literally one use.

It's not thousands of hours with a better-defined format than free-form text, which is obviously the case for journald. Doing that for a single format makes sense for something which would be as heavily used as system logs – it's not like people haven't spent crazy amounts of time writing syslog parsers or dealing with other hassles which go away with a structured format.

This is like saying:

"I get shot going to work every day. But I bought a bullet proof vest, so life is okay."

Iron bandages don't solve problems; they work around them for a short time. Except, because nobody actually likes writing systems-level code, they stick forever.

> "I get shot going to work every day. But I bought a bullet proof vest, so life is okay."

You get corruption if the machine shuts down incorrectly. In binary or text files. In that case journald/syslogd act like a black box. Syslogd will give you whatever garbage it has in those text files, journald will give you the surviving records and will tell you which ones are unreadable.

If you don't want to get shot, shut down your machine properly. If you cannot shut down your machine properly because it crashed, then you have garbage data whether you want it or not. Drives work that way.

I'm not sure what you are arguing for.

Journald does not write broken records itself.