by vlovich123 587 days ago
This should be cheaper, but the expensive part that isn’t discussed has to be snapshotting the stack, which feels expensive, and that’s what the panic information is supposed to preserve. That’s why they got an extra 5x performance improvement but not 10x or 100x, and they didn’t provide a benchmark of how frequently you could snapshot the stack. Indeed, with only a simple microbenchmark we don’t see how this improvement holds up when the stacks are ~20 frames deep: do the same hotspots show up, or does capturing the stack start to dominate?
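To make the question concrete, here is a rough sketch of how one might measure stack capture at a given depth. It assumes `std::backtrace::Backtrace::force_capture` is a reasonable stand-in for whatever the panic path does, which may not hold for the work being discussed. The helper names `capture_at_depth` and `average_capture_time` are made up for illustration.

```rust
use std::backtrace::Backtrace;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Recurse `depth` frames down, then capture a backtrace at the bottom.
/// (Hypothetical helper, not part of any library.)
#[inline(never)]
fn capture_at_depth(depth: u32) -> Backtrace {
    if depth == 0 {
        // force_capture ignores RUST_BACKTRACE, so a capture is always attempted.
        Backtrace::force_capture()
    } else {
        // black_box discourages the optimizer from flattening the recursion.
        black_box(capture_at_depth(black_box(depth - 1)))
    }
}

/// Average wall-clock time of one capture taken `depth` frames deep.
/// `iterations` must be non-zero, otherwise the division panics.
fn average_capture_time(depth: u32, iterations: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(capture_at_depth(depth));
    }
    start.elapsed() / iterations
}
```

Comparing `average_capture_time(1, n)` against `average_capture_time(20, n)` in a release build would give a first hint of whether capture cost grows enough with depth to dominate. A proper benchmark harness would be needed before drawing conclusions.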

And one thing that could be needed is the ability to throw within a catch, and if you do that you can corrupt the TLS (i.e. break memory safety) unless you’re careful and follow the guidelines. In other words, you personally can have written 100% safe code that is still not memory sound unless you follow the high-level rules. This is closer to a C API than anything that would be “allowed” as a traditional Rust API, where the guarantee of a safe API is that no unsoundness can happen no matter how you hold it. That’s a lot of safety to sacrifice for something tried and true. Use it if you really need it, but I think following the advice that error states should be rare in the first place is probably better: return an error for any fallible operation and panic on unwind. Trying to catch unwinding panics is a landmine-prone way of trying to get things to work, and I know that from experience, having tried that approach. It doesn’t play well with things like async either. And then you have to bubble them up across threads?
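For readers less familiar with the “return an error for any fallible operation” advice, a minimal sketch of what that looks like follows. `PortError`, `parse_port` and `describe` are hypothetical names chosen for the example. Nothing here panics, so there is nothing to catch.

```rust
use std::num::ParseIntError;

/// Expected failure modes are modelled as data, not as panics.
#[derive(Debug, PartialEq)]
enum PortError {
    NotANumber(ParseIntError),
    Reserved(u16),
}

/// A fallible operation returns a Result; the caller decides what to do.
fn parse_port(input: &str) -> Result<u16, PortError> {
    let port: u16 = input.trim().parse().map_err(PortError::NotANumber)?;
    if port < 1024 {
        // Treating ports below 1024 as reserved is an assumption of this example.
        return Err(PortError::Reserved(port));
    }
    Ok(port)
}

/// The caller handles every outcome explicitly, with no unwinding involved.
fn describe(input: &str) -> String {
    match parse_port(input) {
        Ok(port) => format!("listening on {port}"),
        Err(PortError::NotANumber(e)) => format!("bad port {input:?}: {e}"),
        Err(PortError::Reserved(p)) => format!("port {p} is reserved"),
    }
}
```

Panics are then left for genuine bugs, where unwinding (or aborting) is the intended outcome and not a control-flow mechanism.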

This approach would fail there. These aren’t unfixable design flaws, thankfully. You’d need a sum type whose underlying memory can be detached to the heap, and you’d somehow need to guarantee it’s always detached safely and soundly before being overwritten (e.g. having a counter in the TLS header that is copied to the struct being unwound, to guarantee that the TLS values you think you are accessing have indeed not been overwritten, or having a TLS pointer to the stack value containing the unwound value that is somehow written through to detach whenever someone doesn’t call the right catch mechanism).

So I think this work is super valuable, and the ideas should be refined and mainlined because inefficiencies like this aren’t great. At the same time, no one should be writing error handling by catching unwraps except in very, very limited situations that you can clearly articulate as necessary for the goal you are trying to achieve. For example: I spawned a background thread, but if the computation fails I can report the failure gracefully to the human operator of the machine in a non-debug context. But in those cases you want a forked supervisor process that is responsible only for process death, rather than compiling it all into one binary.

I wish Rust made that part easier. I.e. start the process in a different mode but then switch to panic so that you carry the performance gains (i.e. this crate should be built using optimized panic with unwind, but this other crate is built with a different unwind mode, and you could spawn the other crate through a guaranteed fork to fix the potential unsoundness, with the panic information serialized across the wire to the parent process via a private pipe and unwound that way). That would provide an easier API that indicates more clearly the delegation of responsibility you should have when catching unwinds and how to structure your code operationally.

However, it can’t be the only way, because you might have something like an HTTP framework, where you want to “guarantee” that you deliver an HTTP response to a socket and log metrics before crashing, and you want the next request to be handled immediately with minimal additional CPU work. You can’t just do it across a fork barrier, because it’s expensive to dispatch to a new thread in the happy path, and you have a thread pool in tokio that you need to keep healthy and alive: you can’t fork or spawn a thread on every new inbound connection. So there are cases where you want to catch unwinds, namely to have consistent behavior in a framework even if the user’s code or our code has a bug (but in those cases you would probably use the panic method during debug builds to notice failures like that before your release to production, where you prevent bugs from manifesting as a mechanism/gadget for attackers to DoS your service).
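As an illustration of the framework case, here is a minimal sketch of catching a panic at the request boundary using the standard library’s `std::panic::catch_unwind`. `Response` and `dispatch` are hypothetical and not taken from any real HTTP framework, and this only works when the binary is built with `panic = "unwind"`.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Toy response type, hypothetical and only for this sketch.
#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// Run a handler; if it panics, turn the panic into a 500 response so the
/// worker thread survives and can serve the next request.
fn dispatch<F>(handler: F, path: &str) -> Response
where
    F: FnOnce(&str) -> Response,
{
    // AssertUnwindSafe is a promise, not a proof: any state the handler
    // shares with the rest of the program may be left half-updated.
    match catch_unwind(AssertUnwindSafe(|| handler(path))) {
        Ok(response) => response,
        Err(payload) => {
            // Panic payloads are usually &'static str or String.
            let message = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            Response {
                status: 500,
                body: format!("internal error: {message}"),
            }
        }
    }
}
```

A real framework would also log metrics at the `Err` arm and would likely avoid echoing the panic message to the client. With `panic = "abort"` the `Err` arm is never reached, which is why the choice of panic mode matters for this pattern.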