by wickawic
3247 days ago
Agreed. I think many people do a good job of logging when something goes wrong, and maybe they are good about logging the inputs and outputs of a system, but logging important decisions often falls by the wayside. Ideally I would like to take a request ID, grep the logs, and get the entire story of what happened to that request. In reality, this rarely happens!
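To make "the entire story of a request" concrete, here is a minimal sketch using Python's standard `logging` and `contextvars` modules. The names `request_id_var`, `build_logger`, and `handle_request`, along with the shipping rule, are made up for illustration; the point is only that every line, including the decision lines, carries the same request ID.

```python
import contextvars
import io
import logging

# Hypothetical context variable holding the ID of the request being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def build_logger(stream):
    """Return a logger whose lines all include request_id=<id>."""
    logger = logging.getLogger("app")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s request_id=%(request_id)s %(message)s")
    )
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, request_id, cart_total):
    """Toy handler (assumed business rule) that logs input, decision, output."""
    token = request_id_var.set(request_id)
    try:
        logger.info("received checkout cart_total=%s", cart_total)
        if cart_total > 100:
            # Log the decision and the reason, not just the outcome.
            logger.info("decision: free shipping applied (cart_total > 100)")
            shipping = 0
        else:
            logger.info("decision: standard shipping charged (cart_total <= 100)")
            shipping = 5
        logger.info("responding shipping=%s", shipping)
        return shipping
    finally:
        request_id_var.reset(token)
```

With something like this in place, grepping for `request_id=req-1` returns the input, the decision, and the response for that one request, in order.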
|
If that's the situation you find yourself in, I cannot praise centralized logging with a good frontend highly enough. I frequently find myself trying to figure out what happened to a request, and it's like night and day.
SSH-ing into a machine and running grep against log files works if there are only one or two VMs, but it gets complicated with a handful of machines, and even just SCP-ing the logs off becomes time-consuming if there are a lot of them. Then once the logs are off, 'grep' quickly becomes inadequate. (And I should know: I've built some truly horrible regexps to grep for dates because I didn't know any better.)
All that friction means that answering the original question, figuring out the detailed internal reason why my customer received an HTTP 500 response, is just too toilsome in all but the most pressing cases, so (as you noted) it rarely happens in reality.
With centralized logging, I can search for a request ID and see every log line for it, as often as I need, which is what makes debugging complex multi-system issues practical.
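As a rough illustration of what that search does for you, here is a small sketch that merges log lines from several machines, keeps the ones for a single request ID, and orders them by parsed timestamp rather than by regexp. The line format (`ISO-timestamp request_id=<id> message`), the host names, and the functions `parse_line` and `request_story` are assumptions for the example, not any particular product's API.

```python
from datetime import datetime


def parse_line(host, line):
    """Split 'ISO-timestamp request_id=<id> message' into its parts."""
    timestamp, rid_field, message = line.split(" ", 2)
    request_id = rid_field.split("=", 1)[1]
    return datetime.fromisoformat(timestamp), host, request_id, message


def request_story(logs_by_host, request_id):
    """Return (timestamp, host, message) for one request, oldest first."""
    events = []
    for host, lines in logs_by_host.items():
        for line in lines:
            ts, source, rid, message = parse_line(host, line)
            if rid == request_id:
                events.append((ts, source, message))
    # Comparing parsed datetimes avoids hand-built date regexps.
    events.sort(key=lambda event: event[0])
    return events


if __name__ == "__main__":
    logs = {
        "web-1": [
            "2024-01-01T12:00:00 request_id=abc received GET /orders",
            "2024-01-01T12:00:03 request_id=abc responding 500",
        ],
        "db-1": [
            "2024-01-01T12:00:01 request_id=xyz query ok",
            "2024-01-01T12:00:02 request_id=abc decision: query timed out",
        ],
    }
    for ts, host, message in request_story(logs, "abc"):
        print(ts.isoformat(), host, message)
```

A real centralized logging system does the collection and indexing for you; the sketch only shows why one interleaved, time-ordered view per request ID beats grepping each machine separately.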