Hacker News
by fragmede 3242 days ago
> take a request ID, grep the logs, and get an entire story of what happened to that request. In reality, this rarely happens!

If that's the situation you find yourself in, I cannot praise centralized logging with a good frontend highly enough. I frequently find myself trying to figure out what happened to a request, and the difference is like night and day.

Needing to ssh somewhere and run grep against log files is workable if there are only one or two VMs, but it gets complicated with a handful of machines, and even just SCP-ing the logs off becomes time-consuming if there are a lot of machines. Then, once the logs are off, 'grep' quickly becomes inadequate. (And I should know: I've built some truly horrible regexps to try and grep for dates because I didn't know any better.)
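As a rough sketch of that grep workflow once the logs have been copied into one directory: the file names, the 'req=' field, and the timestamp format below are all made up for illustration, so adjust them to whatever your logs actually look like.

```shell
# Hypothetical sample logs, standing in for files copied off two machines.
workdir=$(mktemp -d)
printf '%s\n' \
  '2024-01-01T10:00:00Z req=abc123 GET /checkout' \
  '2024-01-01T10:00:01Z req=zzz999 GET /health' > "$workdir/web1.log"
printf '%s\n' \
  '2024-01-01T10:00:02Z req=abc123 payment backend timed out' \
  '2024-01-01T10:00:03Z req=abc123 responding with 500' > "$workdir/web2.log"

# -F matches the request ID as a fixed string rather than a regex;
# -h drops the file-name prefix so the lines sort by their leading timestamp.
story=$(grep -hF 'req=abc123' "$workdir"/*.log | LC_ALL=C sort)
printf '%s\n' "$story"
```

This only works because every line starts with a sortable timestamp and carries the request ID. As soon as either assumption breaks, you are back to writing date regexps.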

All that friction means that answering the original question, figuring out the detailed internal reason why my customer received an HTTP 500 response, is just too toilsome to do routinely, which is why (as you noted) it rarely happens in reality.

With centralized logging, I'm able to search for a request ID and see its logs. That is a reality as often as I need it to be when debugging complex multi-system issues.

2 comments

Word of advice: the 'jq' tool for handling JSON files (coupled with a glob like '*.log', or something fancier with xargs or parallel) will absolutely save your bacon in those situations. It's way more powerful than it appears on the surface.

We had a series of Docker json-file driver log files. Each one is a raw stream of JSON objects with no array around them, which is a bit annoying to sort and filter based on properties of the objects.

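'jq -n '[inputs]' (asterisk).log > combined.json' was my favourite command today; it combines the contents of all the files and wraps them in an array correctly. The -n flag matters: without it, jq consumes the first object as its regular input, so 'inputs' never sees it. No awk needed!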
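Here is a small self-contained sketch of that combine step. The sample entries are invented, shaped like the log/stream/time objects the json-file driver writes, and the file names are placeholders.

```shell
# Hypothetical sample files: one JSON object per line, no enclosing array.
workdir=$(mktemp -d)
printf '%s\n' \
  '{"log":"starting\n","stream":"stdout","time":"2024-01-01T10:00:00Z"}' \
  '{"log":"listening\n","stream":"stdout","time":"2024-01-01T10:00:01Z"}' > "$workdir/a.log"
printf '%s\n' \
  '{"log":"boom\n","stream":"stderr","time":"2024-01-01T10:00:02Z"}' > "$workdir/b.log"

# -n starts jq with null input, so `inputs` sees every object in every file.
# Without -n, the first object is consumed as `.` and never reaches the array.
jq -n '[inputs]' "$workdir"/*.log > "$workdir/combined.json"

# Once everything is one array, sorting on a property is a one-liner.
count=$(jq 'length' "$workdir/combined.json")
newest=$(jq -r 'sort_by(.time) | last | .stream' "$workdir/combined.json")
echo "$count entries, newest on $newest"
```

'jq -s .' (slurp) would produce the same array; '-n' with 'inputs' is just the form used here.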

Combine that with its cute:

jq -c '.someProp as $var | ($var | test("some search"; "gi")) as $r | if $r then {value: $var, file: input_filename} else empty end' (asterisk).log > filtered.json

And you're off to the races. You can then load that file straight back in and group_by(.somePath), and it will all magically work!
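A hedged sketch of the filter-then-group idea, using 'select' as a shorter variant of the same filter. The field names 'log' and 'stream' and the search term are assumptions for the example, not anything from the original logs.

```shell
# Hypothetical sample entries spread over two files.
workdir=$(mktemp -d)
printf '%s\n' \
  '{"log":"Error: disk full\n","stream":"stderr"}' \
  '{"log":"request ok\n","stream":"stdout"}' > "$workdir/a.log"
printf '%s\n' \
  '{"log":"retrying after ERROR\n","stream":"stdout"}' \
  '{"log":"error: timeout\n","stream":"stderr"}' > "$workdir/b.log"

# select keeps objects whose .log matches (case-insensitively) and drops the rest.
jq -c 'select(.log | test("error"; "i"))' "$workdir"/*.log > "$workdir/filtered.json"

# -s (slurp) reads the filtered stream into one array, which group_by requires.
summary=$(jq -c -s 'group_by(.stream) | map({stream: .[0].stream, count: length})' "$workdir/filtered.json")
echo "$summary"
```

Note that group_by needs an array as input, so a file holding a bare stream of objects has to be slurped with -s first.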

Edit: had to remove the actual asterisk symbols as they screw with the formatting, but they are what globs the file names. Replace "(asterisk)" with the real character.

True, but even with a centralized logging system, if the logs are not good enough you can still find yourself wondering what the hell happened. Grep here is just the tool to extract the "story".