Some are of the opinion that this should be handled a layer up, for example by a container restart, because the program could be left in a broken state that can only be fixed by resetting the entire state.
Given that you can’t recover from panics on other goroutines, and Go makes it extremely easy to spawn other goroutines, oftentimes it’s not even a matter of opinion: you have to handle it a layer up. There’s no catch-all for panics.
This is a major pain in the ass. I was trying to solve the problem of how to emit a metric when a golang service panics. The issue is that there is no way to recover panics from all goroutines, so the only way to do that reliably is to write a wrapper around the ‘go’ statement that recovers panics and reports them to the metrics system. You then have to change every single ‘go’ statement in all of your code to use this wrapper.
What I really want is either a way to recover panics from any goroutine, or a way to install a hook in the runtime that is executed when an unhandled panic occurs.
You can kind of fudge this by having the orchestration layer look at the exit code of the golang process and check whether it was exit code 2, which is usually a panic. But I noticed that sometimes the panic stack trace doesn’t make it to the process’s log files, most likely due to some weird buffering in stdout/stderr, which causes you to lose the trace forever.
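As a rough sketch of the exit-code check, assuming the thing supervising the service is itself a small Go program (the function name `runAndCheck` is made up, and the `sh -c` child is only a stand-in for a crashed service, so it assumes a POSIX shell is available):

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// goPanicExitCode is the status a Go program exits with when it dies from
// an unrecovered panic. Other things can also exit with 2, so a match is
// a hint rather than proof.
const goPanicExitCode = 2

// runAndCheck (hypothetical helper) runs a command to completion, collects
// its stderr in memory, and reports whether the exit code looks like a Go
// panic. err is non-nil only if the command could not be run at all.
func runAndCheck(name string, args ...string) (likelyPanic bool, stderr string, err error) {
	var buf bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &buf

	runErr := cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(runErr, &exitErr) {
		// The child ran and exited with a non-zero status.
		return exitErr.ExitCode() == goPanicExitCode, buf.String(), nil
	}
	// Either a clean exit (runErr == nil) or a failure to start the child.
	return false, buf.String(), runErr
}

func main() {
	// Stand-in for a crashed Go service: write to stderr, exit with 2.
	likelyPanic, stderr, err := runAndCheck("sh", "-c", `echo "panic: boom" >&2; exit 2`)
	if err != nil {
		fmt.Println("could not run child:", err)
		return
	}
	fmt.Printf("likely panic: %v\nstderr: %s", likelyPanic, stderr)
}
```

Here the supervisor reads the child's stderr itself rather than relying on a separate log file. Whether that avoids the lost traces described above depends on where the buffering actually happens, which I haven't verified.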