Hacker News
by dwattttt 16 days ago
I'm curious, what's the value of a stack trace of another person's library functions? As mentioned, you can get a stack trace that includes all of your code, that's what was offered to you.

The only thing a library gathering a stack trace instead of you gives you is that it includes traces through code you didn't write & ostensibly aren't responsible for. If you're going to go to the effort of tracing through a dependency's code, you might as well add the stack trace yourself; it's a single line of code from the standard library to collect it, std::backtrace::Backtrace::capture().

EDIT: `capture` will only actually grab a trace when env vars say it should; you can use `force_capture` to ignore those. As for why this isn't the default for the errors you're asking about, here's a line from their documentation:

> Capturing a backtrace can be both memory intensive and slow
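
For reference, a minimal sketch of the difference between the two calls. The helper names are made up for illustration; the env vars that `capture` consults are RUST_BACKTRACE and RUST_LIB_BACKTRACE.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

/// Hypothetical helper: only walks the stack when RUST_BACKTRACE or
/// RUST_LIB_BACKTRACE enables it; otherwise returns a disabled placeholder.
fn trace_if_enabled() -> Backtrace {
    Backtrace::capture()
}

/// Hypothetical helper: always attempts to walk the stack, ignoring the env vars.
fn trace_always() -> Backtrace {
    Backtrace::force_capture()
}

/// Report whether a backtrace actually holds frames.
fn describe(bt: &Backtrace) -> &'static str {
    match bt.status() {
        BacktraceStatus::Captured => "captured",
        BacktraceStatus::Disabled => "disabled",
        BacktraceStatus::Unsupported => "unsupported",
        // BacktraceStatus is non-exhaustive, so a fallback arm is required.
        _ => "unknown",
    }
}
```

With the env vars unset, `describe(&trace_if_enabled())` should report "disabled" while `trace_always()` still pays for the stack walk, which is the cost the documentation warns about.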

Ideally (in my ideal world), it would be Result<T, E> that holds the backtrace. The value is that I don't know up front which method call is going to cause an error that is hard to track down, which is why I don't see how "instrument your calls with backtrace yourself" helps. It requires that I already have some idea about the execution path, otherwise I don't know where to put the backtrace instrumentation.

Since Backtrace::capture() is already tied to an env var, we could have the backtrace on Result without affecting performance, since you would only enable it for debugging. This would allow you to, e.g., easily track down a situation where you see in your prod logs that you are encountering a lot of "validation error: string is too long" but you can't tell where it is coming from. Flip the env var, redeploy the application, read the backtrace, turn off the env var, fix the problem.
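
Result itself can't do this today, but a rough approximation at the application level might look like the sketch below. `Traced`, `TooLong`, `validate` and `handle` are all hypothetical names. Note one limitation: the trace is recorded where `?` converts the error, not where the error value was first created.

```rust
use std::backtrace::Backtrace;
use std::fmt;

/// Hypothetical wrapper: an error plus the backtrace taken when it was converted.
#[derive(Debug)]
struct Traced<E> {
    error: E,
    backtrace: Backtrace,
}

impl<E> From<E> for Traced<E> {
    fn from(error: E) -> Self {
        // capture() does no stack walk unless RUST_BACKTRACE or
        // RUST_LIB_BACKTRACE is set, so this is cheap when disabled.
        Traced { error, backtrace: Backtrace::capture() }
    }
}

impl<E: fmt::Display> fmt::Display for Traced<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}\n{}", self.error, self.backtrace)
    }
}

/// Hypothetical validation error matching the log line in the comment above.
#[derive(Debug, PartialEq)]
struct TooLong {
    len: usize,
    max: usize,
}

impl fmt::Display for TooLong {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "validation error: string is too long ({} > {})", self.len, self.max)
    }
}

/// Plain, cheap error value: no trace is attached here.
fn validate(input: &str) -> Result<&str, TooLong> {
    const MAX: usize = 8;
    if input.len() > MAX {
        Err(TooLong { len: input.len(), max: MAX })
    } else {
        Ok(input)
    }
}

/// The `?` below goes through `From`, which is where the trace gets recorded.
fn handle(input: &str) -> Result<usize, Traced<TooLong>> {
    let valid = validate(input)?;
    Ok(valid.len())
}
```

Logging the `Traced` value with `{}` prints the message followed by either the frames or a "disabled backtrace" marker, depending on the env var.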

> track down a situation where you see in your prod logs that you are encountering a lot of "validation error: string is too long" but you can't tell where it is coming from.

Capturing a stack trace is a hefty operation: making it happen on _every_ error creation, which would include creating an error in response to another error (like <failure to allocate> causing <failure to create object>) could easily grind a production server to a halt. Especially if there are correctly handled errors happening: every one of them will pay this cost, every time.

It sounds like a really specific problem here; the log line that's happening is generic enough that it doesn't identify which line of code is emitting the log, so you can't just add `capture` to that line (what logging system even does this? printf logging?).

I feel like we are talking past each other, because you ignored the whole part about "it is already tied to an env var, and it would be still tied to an env var" that you would only enable on demand, so who cares if it's a hefty operation? Also what about other languages that capture stacktraces all the time with exceptions, or scripting languages with type errors, where you can't even turn it off? Rust is somehow different?

It is a specific problem, so what? You see that you are sending 500 from an axum handler, and you are logging "serde deserialization error: line 4 invalid", wouldn't it be nice to see where that came from, without instrumenting all the places you are deserializing something?

Rust errors are not exceptions. Catching exceptions is unbelievably expensive in all languages that support them, compared to handling a Rust error value.

Some languages have exceptions as the only error handling mechanism (C#, Java, scripting languages), and it sounds like that's what you're used to. But this is also broadly agreed to be a severely limiting factor of those languages, resulting from being designed at a time when we didn't know better.

If you want to go fast (and Rust does), you cannot be catching exceptions in the hot path, and you certainly can't be throwing exceptions that carry stack traces, because walking the stack to build up the stack trace is many orders of magnitude slower than returning an error value.

Rust's error handling modes are designed with the benefit of hindsight from all those other languages from the last few decades, and reflect the fact that errors broadly fall in two categories: validation failures and programmer errors. The former should be a cheap error code that can be handled, the latter should terminate the program/thread/task and give you enough information to diagnose the problem.
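
To illustrate the split, here is a small sketch with made-up names (`ValidationError`, `parse_username`, `nth_initial`): the validation failure is an ordinary value the caller can match on, while the programmer error panics.

```rust
/// Hypothetical validation failure: a small enum, cheap to create and return.
#[derive(Debug, PartialEq)]
enum ValidationError {
    Empty,
    TooLong { max: usize },
}

/// Bad input is expected at runtime, so it comes back as an error value.
fn parse_username(raw: &str) -> Result<String, ValidationError> {
    const MAX: usize = 16;
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        Err(ValidationError::Empty)
    } else if trimmed.len() > MAX {
        Err(ValidationError::TooLong { max: MAX })
    } else {
        Ok(trimmed.to_string())
    }
}

/// An out-of-range index or an empty name here means the caller has a bug,
/// so this panics (indexing and `expect`) instead of returning an error.
fn nth_initial(names: &[String], index: usize) -> char {
    names[index]
        .chars()
        .next()
        .expect("names are validated as non-empty")
}
```

The panic path is the one that prints a backtrace when RUST_BACKTRACE is set; the `Result` path carries no trace at all.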

I can't reconcile what you're asking for with the situation you're describing. If every single error everywhere in the program created a stack trace and logged it at creation time, your error would be lost under an avalanche of benign errors that are handled. And if you only want to selectively log _that_ error that's interesting, you need to selectively modify the place that logs it, which you don't want to do (because you don't want to have to find it).

It sounds like what you want is the errors you log to always log stack traces. Which is a fine position, I do something like that. It's just not something that can be the default, because it can't be done everywhere.
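
A rough sketch of that idea, assuming a hypothetical helper `format_logged_error` that you would call wherever you log an error. It takes the trace at the log site, so it shows how execution reached the logging call rather than where the error value was created.

```rust
use std::backtrace::Backtrace;
use std::error::Error;

/// Hypothetical logging helper: formats an error, its chain of causes,
/// and a backtrace taken at the point of logging.
fn format_logged_error(err: &dyn Error) -> String {
    // force_capture ignores the env vars; only logged errors pay this cost.
    let trace = Backtrace::force_capture();

    let mut out = format!("error: {err}");

    // Walk the source() chain so wrapped errors are not lost.
    let mut source = err.source();
    while let Some(cause) = source {
        out.push_str(&format!("\n  caused by: {cause}"));
        source = cause.source();
    }

    out.push_str(&format!("\nlogged at:\n{trace}"));
    out
}
```

Errors that are handled and never logged skip the stack walk entirely, which is why this works as a local policy but not as a language-wide default.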