Hacker News
by kuschku 4055 days ago
And the specs will be gone in 40 years. While ASCII will stick around.

    And the specs will be gone in 40 years. While ASCII will stick around.
Why would they be gone? You realize ASCII is a 'spec' too?

If a binary format has an open specification, it's as future proof as ASCII. ASCII's durability is due to a clear and open specification that's easily implemented. Not some magic sauce that makes it instantly human readable.

That text you see? It's not what's actually in the file. That's just 1's and 0's like every other format. There's literally no difference between ASCII and any other "binary" format.

Text encodings have come and gone before, too. We no longer use the Baudot code on modern computers, and EBCDIC is confined to IBM mainframes.
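
To make the "it's all just bits" point concrete, here is a minimal Python sketch. It assumes the `cp037` codec (one EBCDIC code page that ships with Python) as a stand-in for "EBCDIC"; the sample word is made up for illustration.

```python
# The same five characters stored under two different text encodings.
# "cp037" is one EBCDIC code page bundled with Python; using it here is an
# assumption for illustration, since real mainframe logs may use another page.
text = "ERROR"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")


def bits(data: bytes) -> str:
    """Render bytes as space-separated 8-bit binary strings."""
    return " ".join(f"{b:08b}" for b in data)


print("ASCII :", bits(ascii_bytes))
print("EBCDIC:", bits(ebcdic_bytes))

# Neither byte string is "text" until a decoder that knows the spec is applied.
print(ebcdic_bytes.decode("cp037"))
```

The two byte sequences share nothing, yet both decode back to the same word once you apply the right table. Readability comes from the published mapping, not from the bytes.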
Does that really matter? Log files are often unimportant once they are more than a month or two old. What is it in your log files that has to be kept for 40 years?

Longevity of log files hardly seems like a reason to pick an otherwise inferior format.

It is not about reading 40-year-old logs, but rather about reading logs generated today by a 40-year-old system.

For example, many nuclear power plants in the West were built 40 years ago. Among the myriad sensors and devices in a power plant, I think most output ASCII logs. They are still readable today. (The same can be said about avionics, space probes, etc.)

Now imagine yourself 40 years from now, trying to fix or reverse engineer a very old legacy system: you will have to recompile a journalctl from 40 years ago before being able to read anything.

There's a good chance that you'd be reading EBCDIC logs. :)

40 years from now, you will probably be able to invoke journalctl on the system and parse the dumped output as plain text. Or call gunzip on the compressed logs; $DEITY knows if we will still be using gzip by then. And if the system does not boot, you won't be able to connect the peripherals anywhere else... :)
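
As a rough illustration of the gunzip point: a gzip-compressed text log is already a binary file that needs a tool implementing a published format before it becomes readable. A minimal sketch using Python's standard `gzip` module follows; the log line is invented for the example.

```python
import gzip

# A made-up log line, purely for illustration.
log_line = "Jan 01 00:00:01 host daemon[42]: started\n"

# Compressing turns the "plain text" log into an opaque binary blob.
compressed = gzip.compress(log_line.encode("ascii"))

# gzip files begin with the two magic bytes 0x1f 0x8b.
print(compressed[:2].hex())

# Reading it back requires a decompressor first, and only then a text decoder.
restored = gzip.decompress(compressed).decode("ascii")
print(restored, end="")
```

Whether the reader is gunzip or journalctl, the log is only as accessible as the tool and the format description that survive alongside it.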

This is a strawman that keeps being brought up.

There's no tool out there that generates log files it can't itself read. So there's not going to be any "oh gee I have these files being generated and nothing can read them" situation.

However, there are close to zero systems out there that generate text logs they can themselves read. Text logs are write-only for most logging systems, while all binary logs I know of are read+write.

Stepping back, though, this entire argument is absurd. Thinking about "whatever will those people do 40 years from now with the tools of today" is fairly braindead once you understand that the quality of the tools will affect their longevity. So if the logging system becomes an actual, factual problem over time, the tools will die off by naturally-artificial selection.

I have already worked on a very basic embedded system where your only way of getting logs is connecting to the device over a serial line; after fiddling a bit with the baud rate, you can get some readable output.

In this case, you can't really do anything from the device itself.

Arguably, this is not the use case for a binary logger, but I was originally addressing the "40-year-old logs" argument; such systems do exist in the real world.

> There's no tool out there that generates log files it can't itself read.

There are plenty of tools that don't read their logs: more precisely, computer units where you don't log in, units that you don't operate from a console. Embedded devices that perform some function and also keep some log, but which cannot be used for reading that log. You will need to read that log using something else. Plain text (ASCII, and now ISO Latin and UTF-8) is a fairly stable format for everything, and will be for the next 50 years.

People usually read log files because something went wrong, like a system crash. Why do you assume the OS that generated the log file will be readily available?
It isn't a strawman at all. It's a real problem that exists today for folks who need to access old documents from the 80s, for example.
> There's no tool out there that generates log files it can't itself read.

If only. I've used two such systems myself.

When the tool for your binary data file says "data corrupt" when you try to open it, what do you do?