Hacker News new | ask | show | jobs
by enneff 1305 days ago
The question: what is the state of your server after a handler panics? The answer: you have no idea. It is not wise to continue serving requests when you may have serious issues in the state of your server. Maybe some central data structure is now corrupt. You have no way of knowing. Fail fast and fail hard.

OTOH, maybe your priorities are different and you would prefer to be more available than correct. In that case by all means add a recover to the top level of your request handlers. But it was a mistake to have made this decision for all users of net/http ahead of time.

1 comments

In my experience, the panic is most likely because someone accessed a nil field when adapting some data. Nothing is corrupt, we just threw an exception in a mundane way.

The reality is this is far more common than something truly fatal. Mistake for someone or not, it probably is correct for most use cases

Yes, why they made that decision back then. However in the end the lib cannot and shouldn't know, and this decision needs to be intentionally done by the middleware. Correct for most use cases is not good enough here, should be always correct for such fundamental thing.
You're assuming that the code was written exception safely. E.g.

    mu.Lock()
    foo := bar[baz]    // <- throws exception / panics
    mu.Unlock()
Go is sold as a language without exceptions, so people don't write exception-safe code. Which is fine, except when exceptions are actually caught.
That wouldn't pass a code review where I work... Use a defer to do the unlock
I would also not allow it. I'm saying the problem is that core Go developers say "Go doesn't have exceptions", which is manifestly false, but causes people to not write exception safe code.

But despite you and me, I'm saying there's a lot of broken code out there because of this doesn't-but-actually-does misinformation.

And it's very annoying that you have to tell people to do:

    var i int
    func() {
      mu.Lock()
      defer mu.Unlock()
      i = foo[bar]
    }()
Clean code, that is not. (even if you simplify it by having the lambda return the int)
This is like the biggest thing scaring me away from Go. This half-assed "we don't use exceptions, so you shouldn't have to care about it, except when we do, so you still must write proper defers, which are now doubly verbose because nobody considered it a primary use case"... In any other language, a mutex outside a using/with/try-with/RAII would be instantly flagged in code-review or linter tools. In many cases even hard to write incorrectly, due to entering context being only way to acquire the lock.

Now this middle ground leaves you having to write triple verbose if err != null on every third line of your code and still not be safe from panics-that-shouldnt-have-been-panics.

As parent says, the only way panics can ever work is if the top-level never catches and recovers from them. I'm no expert in go but that would mean in such perfect world, defer should hardly ever be needed at all, not even for locks? Only for truly external resources? But now with popular web servers doing such recovery, the entire ecosystem got polluted and all need to handle it?

Does defer get called on panic? I thought panic cancels all further execution
Yes it does, which is why recovering from a panic can be done in a deferred function. The go runtime maintains enough metadata to track what deferred functions need to be run while unwinding the stack.
The main issues that usually arise as a result of catching panics in handlers like HTTP is that unless code it written very deliberately to be able to recover from being interrupted in the middle of any function call, there is a high risk of e.g. mutexes left locked, which in turn leads to a (silent) program deadlock in general.

This has happened during my couple years at Google at least once, even though it wasn't in an HTTP handler, but the issue was very similar.

> most likely

> probably is correct

Yeah I don't know...

But that implies your request checking is off; if your server expects a field to be filled in and a client doesn't send it, that should be a HTTP 400 Bad Request error, not an internal server error.

C/C++ based servers running into an error like that would possibly be open for attacks. Go will be a bit more resilient, but it's still better to avoid situations like that.

That said, logging the error and going on serving responses is fine I think (pragmatic), as long as the error is analyzed. But an error that doesn't trigger immediate action is a warning, and warnings are noise [0].

[0] https://dave.cheney.net/2015/11/05/lets-talk-about-logging#:...

> In my experience, the panic is most likely because someone accessed a nil field when adapting some data. Nothing is corrupt, we just threw an exception in a mundane way.

Even more so if a database is involved (which is generally the case), because odds are the transaction just gets rolled back and there's basically nothing that could be corrupted.