by tedsuo
3341 days ago
The most common problem is that code executing at that point can hang in unexpected ways, preventing shutdown. It's a real bummer when that happens. I've even seen it happen with logging written the wrong way, where the code attempts to flush the logs to ensure they are written... and hangs. Meanwhile, the crazy code that panicked is still running its other goroutines (remember, you called recover!), so maybe now the webserver still has an open port and is allowing your users to access whatever strange state is left inside... gonzo things really can happen if you let a corrupted program stay up rather than shutting it down immediately. Even if nothing bad is happening, you are still out of commission for that entire period. The point is, Murphy's law always comes into play. So if we're talking about production best practices, consider that "most likely fine" means "definitely not fine at scale over time". Just make sure that whatever you're doing during shutdown can't block.
|
I'm seriously wondering if you ran into any trouble with recovering panics in production, because that would imply that all the Java, C#, Python, JS and Ruby server code in production that happily catches and logs exceptions in the main request handler is constantly running into corrupted state.