by felixgallo 4054 days ago
You missed the joke at the end where he correctly pointed out that Windows' logging is a total joke, and that discovering information from Windows logs is essentially impossible unless the tool writer specifically predicted your use case.

And that's the nub of it: text logs are for when you may have many varied, complex reader use-cases, and you don't understand all those cases well enough yet to lock them down forever, and you have a thousand excellent tools at your disposal that you would like to be able to continue to use.

Recent log spelunking for me included:

    cat log.? | grep fail | sed 's/^.worker_id$//g' | awk '{ print $5, $4 }' | sort -n -r | sed 30q

There's no analogue in any binary logging system I've ever found.


It seems to me that a simple transitional step would be for the implementer of a binary logging system to also ship a tool that reads a binary log file on stdin and writes it to stdout in one or more common text log formats, selected by command-line arguments.

That lets you develop an ecosystem of supporting tools that take advantage of any strengths of the binary format, while still allowing the freedom of using the (initially, at least, probably far more capable) set of tools available for the text formats.

What is the point of such a 'transition' if there never arrives any point at which there is net added value to a binary format?

If the format offers some benefit (not necessarily a net benefit for all users at first, since benefit varies from user to user, but a significant one for some subset), the point is to reduce the cost of moving away from a native text format. That increases the number of users for whom there is a net benefit from the start, which in turn increases early use of the binary format and the effort likely to go into auxiliary tools that exploit it. Those tools then speed up the rate at which the format becomes a net benefit for a wider range of users.

This may or may not ever make it a net benefit for every user, but that's okay. There's a whole lot of space between "this technology is the best choice for everyone" and "this technology is the best choice for no one".

This isn't really true, as the Windows event logs contain text as well as other structured data, and you can search that text using tools on the system. For example, to search for some specific text in the System log using PowerShell:

    Get-EventLog -LogName System | Where {$_.Message -Match "something"}
To process text as fields, as with awk, one would use the Split method (at least to start off with):

    Get-EventLog -Log System | Where {$_.Message -Match "something"} | %{ $_.Message.Split()[5,4] }
But as message text is often parameterised, it may be easier to take advantage of this data to get what you need. For instance, this command would extract the latest machine sleep and wake times from the system log, and calculate the duration:

    Get-EventLog -Log System -Source Microsoft-Windows-Power-Troubleshooter -InstanceID 1 | Select-Object @{n="SleepTime";e={$_.ReplacementStrings[0]}}, @{n="WakeTime";e={$_.ReplacementStrings[1]}}, @{n="SleepDuration";e={([DateTime]$_.ReplacementStrings[1])-([DateTime]$_.ReplacementStrings[0])}}
One can also sort and get unique values, just as in Unix-type systems - this command lists all drives defragmented in the past 30 days:

    Get-EventLog -Log Application -Source Microsoft-Windows-Defrag -InstanceID 258 -After (Get-Date).AddDays(-30) | Select @{n="Drive";e={$_.ReplacementStrings[1]}} | Sort Drive | Unique -AsString
So all the same capabilities are there, and then some. You just need to know your tools well enough to take advantage of them.

Most binary formats contain text; that isn't what distinguishes them from text formats.

One of the objections, though, is that with binary formats you're limited to the capabilities of the tools that have been built to handle that particular format, which you're illustrating nicely. In a binary format world, I would have to know the capabilities and limitations of dozens, maybe hundreds, of different tools for extracting useful information from logs, instead of the small handful of tools I use to do the same job now, which can be applied to any log file formatted as plain text.

And that's assuming that all these other tools will be as powerful as Powershell, which isn't a bet I'd want to make.

madhouse has some fair points about the limitations of text logs, but "everything should be stored in binary formats" is not a great idea. Actually, "a terrifying new hell" is probably closer to how I feel about it.

If you want to stick to a text-only workflow rather than take advantage of the structured data features, you only need a tool that converts the binary log format to your preferred text format, which isn't too arduous. In systemd that would be journalctl; in Windows, anything that can use the event log API, such as PowerShell or many other utilities.

The examples I posted above were just to show the equivalent capabilities in PowerShell, but really it's all flexible enough to use whatever you like.
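
As a rough sketch of the text-only workflow on a systemd machine: the function below is purely a text filter, so it should behave the same whether its input comes from a plain log file or from journalctl. The function name `count_failing_units` and the file name `app.log` are made up for illustration, and the field position assumes journalctl's `short-iso` output layout (timestamp, host, process tag, message); other layouts would need a different awk field.

```shell
# Hypothetical helper: counts lines mentioning "fail" per process tag.
# It only reads text on stdin, so it does not care how the log is stored.
count_failing_units() {
    grep -i 'fail' |             # keep lines that mention a failure
        awk '{ print $3 }' |     # assumed layout: 3rd field is the process tag
        sort |
        uniq -c |                # count occurrences of each tag
        sort -rn |               # most frequent first
        awk '{ print $1, $2 }' | # normalise uniq's padding to "count tag"
        sed 30q                  # stop after the first 30 lines
}

# Text log on disk (file name is an assumption):
#   cat app.log | count_failing_units
#
# Binary journal, converted to text first:
#   journalctl --no-pager -o short-iso | count_failing_units
```

The only part that knows about the binary format is the first command in the pipeline; everything after it is the same handful of text tools either way.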