Hacker News
by rmanolis 561 days ago
To understand what I said, you have to solve the Error Handling Challenge in your favorite programming language: https://rm4n0s.github.io/posts/3-error-handling-challenge/

Once you try to implement it, everything I said will make sense to you.

Also, we don't need logging systems when a programming language like Odin can parse stack traces with type checking (not just as strings, like the Java example you gave me).

In microservices you are the error handlers. For example, if you see a stack trace in the logs, you go to the code and fix the error.

In Odin I don't need to go to that trouble, because ALL the stack traces can be handled. There will be no undefined behavior in the software and no unexpected input that causes an unexpected stack trace, so there is no reason to have logs.


I looked at that and I find it pretty funny. As in, why would I ever build error handling that cares about the specific call stack to handle the error? That makes no sense to me. I'll show you why.

Let's say you have the call stack as this (from your example):

    f4()->
      f3()->
        f1()->
          ErrInvestmentLost

Great, I'll handle this based on the fact that f1 was called by f3 (in the Java example, you'll just inspect e.getCause() until you reach the desired point in the trace - basically do what `printStackTrace()` does, but instead of printing the trace, do your error handling based on it).

But nobody would ever want to do this because it's super brittle. Say I change f4 so that it first calls f17, which then calls f3, which then calls f17 again, which calls f1. Your error handling based on the call path is suddenly broken.
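To make the brittleness concrete, here is a minimal Java sketch of handling an error based on the call path. It reads the path from `getStackTrace()`; the class `CallPathHandling`, the exception `InvestmentLostException`, and the helper `thrownVia` are hypothetical names made up for illustration, and only f1, f3 and f4 come from the call stack quoted above.

```java
// Hypothetical illustration: f1, f3, f4 mirror the call stack quoted above.
final class CallPathHandling {
    static class InvestmentLostException extends RuntimeException {}

    static void f1() { throw new InvestmentLostException(); }
    static void f3() { f1(); }
    static void f4() { f3(); }

    // True only if the method that threw is `thrower` and its direct caller is `caller`.
    static boolean thrownVia(Throwable error, String thrower, String caller) {
        StackTraceElement[] frames = error.getStackTrace();
        return frames.length >= 2
            && frames[0].getMethodName().equals(thrower)
            && frames[1].getMethodName().equals(caller);
    }

    static String handle() {
        try {
            f4();
            return "ok";
        } catch (InvestmentLostException e) {
            // Brittle: inserting any method between f3 and f1 flips this result
            // without touching the error or the handler.
            return thrownVia(e, "f1", "f3") ? "handled-f3-f1" : "unhandled";
        }
    }
}
```

If f3 were changed to reach f1 through an extra method, `handle()` would fall into the "unhandled" branch even though the same error was thrown for the same reason.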

What are you even trying to achieve by doing this? Proper error handling doesn't depend on the call path. Proper error handling depends on the type of error that occurred and on whether you can actually handle it at all, or whether you just have to give up and throw the error all the way up to where it gets logged, so a programmer can look at why it happened and why we couldn't handle it.
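A minimal sketch of that type-based approach in Java follows. The class and exception names (`TypeBasedHandling`, `InvestmentLostException`, `LedgerUnavailableException`) are hypothetical, and the returned strings stand in for real recovery and logging.

```java
// Hypothetical illustration of handling by error type instead of by call path.
final class TypeBasedHandling {
    static class InvestmentLostException extends RuntimeException {}
    static class LedgerUnavailableException extends RuntimeException {}

    // Handles the one error type it knows how to recover from.
    static String invest(Runnable action) {
        try {
            action.run();
            return "invested";
        } catch (InvestmentLostException e) {
            // Recoverable here, regardless of which methods were on the stack.
            return "refunded";
        }
        // Every other exception type propagates to the caller untouched.
    }

    // Outermost layer: anything nobody could handle ends up here and gets logged.
    static String topLevel(Runnable action) {
        try {
            return invest(action);
        } catch (RuntimeException e) {
            return "logged: " + e.getClass().getSimpleName();
        }
    }
}
```

Adding or removing intermediate methods between `topLevel` and the code that throws changes nothing about which branch runs.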

Your claim about being able to handle "all stack traces" makes no sense to me. You don't handle stack traces. You handle error types.

A real world example of the above (taking Java as an example again) might be a REST resource. My error handling should not depend on, or suddenly break because of, a new filter that someone configured in the filter chain above the actual resource method. Say someone added an `AuthenticationFilter` that checks whether some auth token is present and valid, and that didn't use to be the case. Now any error handling in my resource method that was based on the exact stack trace combination that existed before that filter was added will break horribly.

Odin can represent stack traces as error types.

Your system in Java will break if someone else adds an AuthenticationFilter, but my system in Odin will not even compile until I have handled all the stack trace paths that include AuthenticationFilter.

Do you see the difference between handling stack traces with union types rather than strings?

I don't see how it improves or gives me anything, no.

See, the `AuthenticationFilter` sits outside of my REST resource. I could not care less that someone configured it and that at runtime, based on some configuration that can change without even needing a recompilation, this filter will either be there on the stack or it won't.

My resource does not interact with this filter in any way and when an error happens somewhere down in another method I call, then I don't care that I was called with or without the auth token having been checked by said filter. I might care whether the method I called threw a `SQLException` or a `JSONParseException` but very probably I don't even care about that at all because I can't do anything specific in either case and will just throw it further (i.e. not handle it, other than potentially logging it).

Java actually tried the whole "specify all error situations with checked exceptions and otherwise the code won't even compile" approach. It failed miserably, and you are hard pressed to see anything new derive directly from `Exception`. Everyone uses `RuntimeException`. That does come at a cost: I no longer have the hassle of declaring these exceptions, but I also no longer explicitly know and decide what to do with them, so I may only find out that a particular type of error can happen once I "see it in the wild" (e.g. in my logs, because something failed), unless I'm lucky enough to have read the documentation and handled all the exceptions I wanted to handle.

But that happened precisely because it was just too much to have all your code specify these exceptions when everyone figured out that 99% of all code just threw them further up the stack. You call one new library method that specifies an exception and you suddenly have to adjust 127 other files and the only thing you do is to declare all those methods will also just throw the exception further up the stack.
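The ripple effect looks roughly like this in Java. The layer names (`repository`, `service`, `controller`) and `LibraryException` are hypothetical and only stand in for the many files mentioned above.

```java
// Hypothetical illustration of a checked exception rippling through layers.
final class CheckedRipple {
    // Stand-in for a checked exception declared by a newly used library method.
    static class LibraryException extends Exception {
        LibraryException(String message) { super(message); }
    }

    static void libraryCall() throws LibraryException {
        throw new LibraryException("library failed");
    }

    // None of these layers can do anything useful with the error,
    // yet each must repeat the `throws` clause or the code will not compile.
    static void repository() throws LibraryException { libraryCall(); }
    static void service() throws LibraryException { repository(); }
    static void controller() throws LibraryException { service(); }

    // Only the outermost layer actually does something with it.
    static String entryPoint() {
        try {
            controller();
            return "ok";
        } catch (LibraryException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```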

Odin's way does not mean that you have to do something for every exception that is thrown at you. Also, in Odin SQLException cannot exist because it is too generic, but the more specific SQLClosedConnException tells a different story in a stack trace, one that you can handle.

For example, the Authentication_Filter_Error union will contain another union called SQL_Verify_Account_Error, which contains the SQL_Error enum with the Closed_Conn value.

Imagine your stack trace like this: Authentication_Filter_Error -> SQL_Verify_Account_Error -> SQL_Error.Closed_Conn

Now that you know this can happen (through CDD), you can write a switch statement that catches this specific stack trace and calls the system administrator in the middle of the night to check what is happening.
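For readers who don't know Odin, here is a rough Java analogue of that nested-union idea. It is not Odin code, and it assumes Java 21 or later (sealed interfaces, records, pattern matching for `switch`). All type names are hypothetical renderings of the names above, and the returned strings stand in for real actions such as paging the administrator.

```java
// Rough Java 21 analogue of nested error unions; all names are hypothetical.
enum SqlError { CLOSED_CONN, STATEMENT_TIMEOUT }

// Errors that verifying an account can produce.
sealed interface SqlVerifyAccountError permits SqlFailure, AccountNotFound {}
record SqlFailure(SqlError cause) implements SqlVerifyAccountError {}
record AccountNotFound(String accountId) implements SqlVerifyAccountError {}

// Errors that the authentication filter can produce.
sealed interface AuthenticationFilterError permits VerifyAccountFailed, TokenMissing {}
record VerifyAccountFailed(SqlVerifyAccountError cause) implements AuthenticationFilterError {}
record TokenMissing() implements AuthenticationFilterError {}

final class ErrorPaths {
    static String handle(AuthenticationFilterError error) {
        // No default branch: if a new permitted subtype is added,
        // this switch stops compiling until the new path is handled.
        return switch (error) {
            case VerifyAccountFailed(SqlFailure f) when f.cause() == SqlError.CLOSED_CONN -> "page-admin";
            case VerifyAccountFailed(SqlFailure f) -> "retry-later";
            case VerifyAccountFailed(AccountNotFound a) -> "reject";
            case TokenMissing t -> "reject";
        };
    }
}
```

Note that the "path" here is the nesting of error values, not the runtime call stack, so it only changes when someone changes the declared error types.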

This is how software should handle its errors and there is not even a need to log it.

In your scenario, you wake up, you go to work, everyone is screaming at the office, you check the logs, you see the problem, and then you call the system administrator for the problem.

I agree with half of that. The half that says that most exceptions thrown from libraries today and in much of application code as well are way too generic and hide the details that might allow handling them in a `message` String.

However, that's still about the exceptions thrown from further down the stack, not about the call path part of the "stack trace".

I.e. your situation would never happen.

    Authentication_Filter_Error -> SQL_Verify_Account_Error -> SQL_Error.Closed_Conn
This stack / call path is impossible, because when the AuthenticationFilter notices that the token is invalid, it returns a 401 or 403 or whatever is appropriate and my REST resource is never actually called. There's no SQL being run and very definitely no "connection closed" error occurred.

But let's say there was a distinction made with proper exception types and instead of `SQLException("Connection closed")` and `SQLException("Statement timeout")`, I actually received `SQLException(ConnectionClosedException())` vs. `SQLStatementTimeoutException`. Now, without string parsing, I can know that either the connection just closed or that the statement was aborted due to timeout. If these are checked exceptions, I have to declare that I'm aware these can happen and what I want to do with them: Handle or rethrow.

However, a myriad of such exceptions can happen. I would probably have to declare 20-50 exceptions way up in a REST resource layer. Not only can these two happen, but many other situations on the network or database side and on the JSON parsing side for the payload I receive, some exceptions from my business logic etc.

And for most of these, what can I do? If the connection to the database closed, all I can do is log the error and return a `500 Internal Server Error` to my caller. Guess what I can do when a statement timeout occurs? I log the error and return a `500 Internal Server Error` to my caller. For a statement timeout I can't even return a `400 Bad Request`, because it's not knowable whether the timeout occurred because the database was simply overloaded in that moment or because the request itself was created with parameters that will always cause a statement timeout. That stays unknown until we see the logs and figure out through investigation that it wasn't a bad request after all: we were missing an index and the table finally grew large enough for that to matter.
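A small sketch of that point, with hypothetical exception types: even when the two database errors are distinct types, both end up as the same response. The `MalformedPayloadException` case mapping to 400 is my own assumption, added only to show one of the few errors the resource can actually tell apart.

```java
// Hypothetical illustration: distinct error types, same outcome for the caller.
final class StatusMapping {
    static class SqlConnectionClosedException extends RuntimeException {}
    static class SqlStatementTimeoutException extends RuntimeException {}
    // Assumption for illustration: a payload error that is clearly the caller's fault.
    static class MalformedPayloadException extends RuntimeException {}

    static int statusFor(RuntimeException error) {
        if (error instanceof MalformedPayloadException) {
            return 400; // one of the few cases with a specific answer
        }
        if (error instanceof SqlConnectionClosedException) {
            return 500; // nothing to do but log and fail
        }
        if (error instanceof SqlStatementTimeoutException) {
            return 500; // overload or bad request? not knowable at this point
        }
        return 500; // everything else: log and fail as well
    }
}
```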

So yeah, I'm good with `RuntimeException` and handling only very few specific ones ever.

Also nobody will be screaming when the token is invalid and I definitely don't call any system administrator. That's something you as a developer look into. Same with the statement timeout.

No, you don't need to just log and return 500. You can make the software handle these kinds of errors.

You could make the software call the system administrator and return a message to the user: "Try again in an hour, the system administrator is fixing it now."

Or, if it is a timeout, the software could call Amazon to buy a new machine to scale the database and send a message to the user: "Try again in an hour, once we have scaled the system."

A developer's job is to automate the error handlers, not to be the error handler.

If you can see the stack trace tree, then you can plan far ahead, but to do that we need to destroy this "agile" mindset, which is always in a hurry and doesn't let you think that far ahead.