by randomdata 1035 days ago
> providing a core library way of wrapping with stacktraces would be a very useful next step

What eventually became the standard library error wrapping proposal evolved from the work done on the Upspin project. It did include stacktraces, and its authors believed, like you, that they would be useful to have. But analysis of the data showed that nobody ever really used them in practice and, for that reason, they were removed from the final proposal.

> particularly given the most popular package doing that previously is now unmaintained.

Lacking wide appeal doesn't mean there isn't a niche need, of course. However, with the standard library accepting a standard for error wrapping, which this package you speak of has been updated to be compatible with, what further maintenance would be needed, exactly? It would be more concerning if it wasn't considered finished by now. It seems the solution for niche needs is right there.


In the last few months I've realized what I desperately need: a way to wrap an error with a call stack at the point where it enters our code base. This would probably save me on average 20-30 minutes a week.

I see this all the time:

   main.go:141 error: could not transmogrify the thing: a144cd21c48
And then I literally grep the code base to find the error message. That works ~50% of the time, but the other 50%, I see this:

   main.go:141 error: not found
And then I have to spend 5-10 minutes spelunking to try to find where that error might have originated from.

But this would be amazing:

   main.go:141 error: not found callstack=...
This is such an infuriating problem. I'm convinced I'm using Go wrong, because I simply can't understand how this doesn't make it a toy language. Why the $expletive am I wasting 20-30 and more minutes per week of my life looking for the source of an error!?

Have you seen https://github.com/tomarrell/wrapcheck? It's a linter that does a fairly good job of warning when an error originates from an external package but hasn't been wrapped in your codebase to make it unique or stacktraced. It comes with https://github.com/golangci/golangci-lint and can even be made part of your in-editor LSP diagnostics.

But still, it's not perfect. And so I remain convinced that I'm misunderstanding something fundamental about the language because not being able to consistently find the source of an error is such an egregious failing for a programming language.

I find it interesting how, as soon as the word error shows up, people seemingly forget how to program.

Ignore the word error for a moment. Think about how you program in the general case, for a hypothetical type T. What is it that you do to your T values to ensure that you don't have the same problem?

Now do that same thing when T is of the type error. There is nothing special about errors.

Its flaws or merits aside, when you have no other useful context to add to the error, that's precisely what Errorf is for.

  func bar() error {
    err := baz.Transmogrify()
    return fmt.Errorf("transmogrify: %w", err)
  }

  func foo() error {
    err := bar()
    return fmt.Errorf("bar: %w", err)
  }

  func main() {
    err := foo()
    fmt.Printf("foo: %v", err)
    // foo: bar: transmogrify: not found
  }
There's your callstack, without the cost of carrying around the actual callstack.
Nitpicking here, but I prefer a different convention.

  func bar() error {
    err := baz.Transmogrify()
    return fmt.Errorf("bar: %w", err)
  }

  func foo() error {
    err := bar()
    return fmt.Errorf("foo: %w", err)
  }

  func main() {
    err := foo()
    fmt.Println(err)
    // foo: bar: transmogrify: not found
  }
Also, I tend to skip quite a lot of layers. The (only?) advantage of manual wrapping over stack traces is that a human can leave just 3 wrappings which are deemed sufficient for another human, while stack trace would contain 100 lines of crap.

    func bar() error {
        err := baz.Transmogrify()
        return fmt.Errorf("bar: %w", err)
    }
This is broken. If baz.Transmogrify() returns a nil error, bar will return a non-nil error.
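
The nil case is easy to check in isolation. A minimal sketch, where transmogrify is a hypothetical stand-in for baz.Transmogrify and wrapAlways / wrapChecked are made-up names for the two styles:

```go
package main

import (
	"errors"
	"fmt"
)

// transmogrify is a hypothetical stand-in for baz.Transmogrify.
func transmogrify(fail bool) error {
	if fail {
		return errors.New("not found")
	}
	return nil
}

// wrapAlways wraps unconditionally, so even a nil err becomes a non-nil error.
func wrapAlways(fail bool) error {
	err := transmogrify(fail)
	return fmt.Errorf("transmogrify: %w", err)
}

// wrapChecked only wraps when there is something to wrap.
func wrapChecked(fail bool) error {
	if err := transmogrify(fail); err != nil {
		return fmt.Errorf("transmogrify: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(wrapAlways(false) != nil)  // true: success is reported as failure
	fmt.Println(wrapChecked(false) != nil) // false
	fmt.Println(wrapChecked(true))         // transmogrify: not found
}
```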

Also, annotations like this, which repeat the name of the function, are backwards. The caller knows the function they called, they can include that information if they choose. Annotations should only include information which callers don't have access to, in this case that would be "transmogrify".

The correct version of this code would be something like the following.

    func main() {
        fmt.Printf("err=%v\n", foo())
    }
    
    func foo() error {
        if err := bar(); err != nil {
            return fmt.Errorf("bar: %w", err)
        }
        return nil
    }
    
    func bar() error {
        if err := baz.Transmogrify(); err != nil {
            return fmt.Errorf("transmogrify: %w", err)
        }
        return nil
    }
Indeed, our code base is littered with fmt.Errorf("...: %w", err), but that only works if enough places in the code add context. Currently only about 15% of return sites do this.

And I disagree that the cost of carrying around the callstack is something to worry about. Errors are akin to exceptions in C++/Java: no happy path should rely on errors for control flow (except io.EOF, but that won't generate a call stack). They should be rare enough that any cost below about 1ms and 10k is negligible.

Every error should be annotated at the call site. fmt.Errorf("...: %w", err) isn't litter, it should be a basic expectation of any code which passes code review.

> Errors are akin to exceptions in C++/Java: no happy path should rely on errors for control flow (except io.EOF, but that won't generate a call stack). They should be rare enough that any cost below about 1ms and 10k is negligible.

This may be true in C++ or Java, but in Go, it is absolutely not the case.

Errors are essential to, and actually the primary driver of, control flow!

Any method or function which is not guaranteed to succeed by the language specification should, generally, return an error. Code which calls such a method or function must always receive and evaluate the returned error.

Happy paths always involve the evaluation and processing of errors received from called methods/functions! Errors are normal, not exceptional.

(Understanding errors as normal rather than exceptional is one of the major things that distinguish junior vs. senior engineers.)

I think there's less daylight between us than it seems.

> Errors are normal, not exceptional.

The _handling_ of errors is normal. Code that doesn't consider errors is not production code.

And granted, in Go, control flow is driven by errors more often than in C++ or Java. Sentinel error values are common. See for example all usage of errors.Is, checking for io.EOF, packages that define ErrSituationA and ErrSituationB, etc.

But my argument was about errors that can't be dealt with locally, where the origination and ultimate handling are very far apart. A given flow will encounter these errors relatively rarely compared to the happy path (and if it's not rare, you probably need to fix or change something). Having an intuition about this is important for predicting your code's performance. For example:

- The SQL call failed because the network connection dropped; client gets 500 or 502, or retry.

- A call to an external service failed because the network was bad; it gets retried.

- The SQL call succeeded, but the record the client asked for wasn't found, so the client gets a 404.

- Writing to a temporary file failed because the disk is full, so some batch job fails with an error.

Apart from potential concerns about DoS, worrying too much about the performance of error handling in these relatively rare cases is absolutely premature optimization.

DoS isn't even a concern. I just benchmarked capturing a call stack in Go, and it's on the order of a few microseconds. Unless you're in performance critical code (and you're benchmarking, right?), it's fine.

> But my argument was about errors that can't be dealt with locally, where the origination and ultimate handling are very far apart. A given flow will encounter these errors relatively rarely compared to the happy path (and if it's not rare, you probably need to fix or change something). Having an intuition about this is important for predicting your code's performance.

When code encounters an error, it can either deal with that error programmatically, or return that error to its caller. I don't think you can make any generalized assertions about whether one or the other of these cases is more common, and I'm confident that you can't assert that one or the other of these cases is better or worse than the other, or that one of them represents a problem worth fixing.

Errors potentially occur at every fallible expression. Where an error is handled is orthogonal, and generally unknowable, to the given bit of code that receives that error.

I agree with you that "the performance of error handling" should never be a first-order concern when writing code.

I don't agree with you that capturing a call stack is fast enough to ignore. Calling runtime.Callers (https://pkg.go.dev/runtime#Callers) takes time proportional to the size of the pc []uintptr slice, and can easily get to O(ms) or beyond. It's fine if a given bit of code opts in to this cost, but it's not something that you should do by default; the threshold for performance critical code is O(ns), not O(us).

> worrying too much about the performance of error handling in these relatively rare cases is absolutely premature optimization.

It's not something to worry about, but it's also a premature optimization to include when there is no need. The Go team considered adding stack traces as described before 1.13, postulating that it would be useful, but measurement determined that they were rarely used in the real world.

If your measurements (you are measuring, right?) for your specific situation tell a different story, stack traces aren't something to be afraid of, but it would be silly to make them the default for everyone. The standard tools don't need to serve every single use case ever imagined.

The reality is, unless you forget how to program every time you see the word error (which seems to be a thing), in the real world you are never going to just `return err` up, up, up the stack anyway. Even ignoring traceability concerns, that is going to introduce horrible coupling. You wouldn't do that for any arbitrary type T, so why would you for type error? There is nothing special about errors.

> Any method or function which is not guaranteed to succeed by the language specification should, generally, return an error.

Most Go programmers are too scared to panic and abort when invariants are violated. I think most codebases contain at least 2x as much error handling as is really necessary.

Nope.

Panic isn't an ersatz error reporting mechanism, it's a tool of absolute last resort. Any function or method that can fail should return an error, and should signal failure via that error. Callers that invoke any fallible function or method should always receive, inspect, and respond to the returned error.

> that only works if enough places in the code add context.

It would be a bit odd to not add context, wouldn't it? Same goes for any value. This is not exclusive to errors. If you consider a function which returns T, the T value could equally be hard to trace back if you find you need to determine its call site and someone blindly returned it up the stack. There is nothing special about errors.

While ideally you are returning more context than Errorf allows, indeed, it is a good last resort. If your codebase is littered with blind returns, the good news is that it shouldn't be too hard to create a static analyzer which finds blind returns of the error type and injects the Errorf pattern.

Are you suggesting it's OK if ParseInt failures take 1ms? Or should ParseInt use a different "kind of error" that's not commensurate with the regular error kind?

Do you think most errors look more like ParseInt, or more like sql.Open where 1ms might be acceptable? (Do you think a call stack from the insides of sql.Open would be useful? My experience, mostly not...)

So the stacks should probably only be for "complex errors", and only for frames that happen in code you (hand waving) "care about". Maybe your programs just have far too complex internal error handling?

See my response to a sibling. I wasn't clear; I was implicitly differentiating between these:

1. errors that can be handled locally (such as parsing; in other languages, these situations are often signaled with return values instead of exceptions)

2. errors that can't be handled locally (such as network errors; other languages use exceptions for these)

My argument was that worrying too much about error handling performance in #2 is premature optimization. 1ms is extreme, but the actual figure of capturing a call stack in Go -- several microseconds, by my benchmark -- puts it squarely in the "don't worry about it unless your code is performance-critical" category.

An error is an error. The immediate caller is always responsible for detecting and handling errors in whatever way is appropriate for their calling context.
Not exactly -- you should only fmt.Errorf wrap errors which are non-nil.

See my sibling comment: https://news.ycombinator.com/item?id=37234455

Agreed. In fact I wrote a (very) small library to help deal with it.

https://github.com/kitd/chock