by brigadier132 641 days ago
It's far from perfect.

One of the biggest problems with Rust error handling is that if you want to have explicit error return types encoded in the type system you need to create ad hoc enums for each method that returns an error. If you only use a single error type for all functions you will inevitably have functions returning Results that contain error variants that the function will never actually return but still need to be handled.

Without enum variants being explicit types, and without the ability to easily create anonymous enums using union and intersection operators, Rust enums require a ton of boilerplate.
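A minimal sketch of the "single error type" problem described above. Every name here (`AppError`, `parse_port`, `port_or_default`, the default port 8080) is hypothetical and chosen only for illustration. The function can fail in exactly one way, but its signature advertises every variant of the shared enum.

```rust
// Hypothetical module-wide error type shared by every function.
#[derive(Debug, PartialEq)]
enum AppError {
    InvalidPort(String),
    ConfigMissing,
    ConnectionLost,
}

// Can only ever fail with `InvalidPort`, but the signature cannot say so.
fn parse_port(text: &str) -> Result<u16, AppError> {
    text.trim()
        .parse::<u16>()
        .map_err(|_| AppError::InvalidPort(text.to_string()))
}

// A caller that wants an exhaustive match has to mention every variant.
fn port_or_default(text: &str) -> u16 {
    match parse_port(text) {
        Ok(port) => port,
        Err(AppError::InvalidPort(_)) => 8080, // assumed fallback port
        // The compiler requires these arms even though `parse_port`
        // never produces them.
        Err(AppError::ConfigMissing) | Err(AppError::ConnectionLost) => {
            unreachable!("parse_port never returns these variants")
        }
    }
}
```

A wildcard arm would also satisfy the compiler, at the cost of silently swallowing any variant added later.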

I'm just now learning Rust, as a long time C++'er, and this was the first part of my Rust journey where I thought to myself, "Boy, this really smells--this couldn't possibly be the idiomatic Rust Way to handle functions that can produce different types of errors. I must be doing something wrong!"

For example, I have a function that takes an array of bytes, decodes it as UTF-8 text, parses that text into an i32, and checks that the int is within a valid range. This is not a big function. But it might produce one of: 1. str::Utf8Error, 2. num::ParseIntError, or 3. MyCustomInBoundsError. There's no clean way to write a Rust function that could return any one of them. I had to bundle everything up into an enum and return that, and then the caller has to do some "match" acrobatics to handle each error differently.

I hate to say this, but I miss Python and C++'s exceptions. How nice to just try: something and then:

    except SomeError:
        doFoo()
    except ThatError:
        doBar()
    except AnotherError:
        doBaz()
    finally:
        sayGoodbye()
An elegant weapon for a more civilized age.

What do I know though? I'm still in the larval stage of Rust learning where I'm randomly adding &, *, .deref() and .clone() just to try to get the compiler to accept my code.

You can still do that in Rust if you want / need to:

https://play.rust-lang.org/?version=stable&mode=debug&editio...

    fn main() {
        match process(&[0x34, 0x32]) {
            Ok(n) => println!("{n} is the meaning of life"),
            Err(e) => {
                if e.is::<std::str::Utf8Error>() {
                    eprintln!("Failed to decode: {e}");
                } else if e.is::<std::num::ParseIntError>() {
                    eprintln!("Failed to parse: {e}");
                } else {
                    eprintln!("{e}");
                }
            }
        }
    }
    
    fn process(bytes: &[u8]) -> Result<i32, Box<dyn std::error::Error>> {
        let s = std::str::from_utf8(bytes)?;
        let n = s.parse()?;
        if n > 10 {
            return Err(format!("{n} is out of bounds").into())
        }
        Ok(n)
    }
In library code, though, that generally makes the library harder to use, so the enum approach is more idiomatic. The caller's handling then comes out as

    match e {
        MyError::Decode(e) => { ... }
        MyError::ParseInt(e) => { ... }
        ...
    }
etc., which is isomorphic to the style you miss. What you're perhaps missing is that `except ...` is just a language keyword for matching on types, whereas Rust prefers to encode type information as values, so that keyword just isn't needed.
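For reference, a rough sketch of what the enum version of that function might look like end to end. The variant names `Decode` and `ParseInt` follow the `match` above; `OutOfBounds`, the 0..=100 range, and the `describe` helper are assumptions made up for this example. The `From` impls are the part that lets `?` do the conversion.

```rust
use std::fmt;
use std::num::ParseIntError;
use std::str::Utf8Error;

// Hypothetical error enum: one variant per way `process` can fail.
#[derive(Debug, PartialEq)]
enum MyError {
    Decode(Utf8Error),
    ParseInt(ParseIntError),
    OutOfBounds(i32),
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Decode(e) => write!(f, "failed to decode: {e}"),
            MyError::ParseInt(e) => write!(f, "failed to parse: {e}"),
            MyError::OutOfBounds(n) => write!(f, "{n} is out of bounds"),
        }
    }
}

impl std::error::Error for MyError {}

// These conversions are what the `?` operator calls behind the scenes.
impl From<Utf8Error> for MyError {
    fn from(e: Utf8Error) -> Self {
        MyError::Decode(e)
    }
}

impl From<ParseIntError> for MyError {
    fn from(e: ParseIntError) -> Self {
        MyError::ParseInt(e)
    }
}

fn process(bytes: &[u8]) -> Result<i32, MyError> {
    let s = std::str::from_utf8(bytes)?;
    let n: i32 = s.parse()?;
    // Assumed valid range, purely for illustration.
    if !(0..=100).contains(&n) {
        return Err(MyError::OutOfBounds(n));
    }
    Ok(n)
}

// The caller's equivalent of a chain of `except` clauses.
fn describe(bytes: &[u8]) -> String {
    match process(bytes) {
        Ok(n) => format!("ok: {n}"),
        Err(MyError::Decode(_)) => "not UTF-8".to_string(),
        Err(MyError::ParseInt(_)) => "not a number".to_string(),
        Err(MyError::OutOfBounds(n)) => format!("{n} out of range"),
    }
}
```

The `Display` and `From` impls are the boilerplate people complain about; the `match` in `describe` is the part that corresponds to the `except` chain.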

I feel you on the larval stage. Once you get past that, Rust starts to make a lot of sense.

Many error types implement std::error::Error, maybe using that would make things easier.

An example: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Hmm, I think I tried this a few times, but I could never get the right magical combination of dyn, Box<> and & to get it to compile.
This is true, though the issue of having error variants the function can't return is fairly overblown, because most code doesn't bother handling all error variants individually. Most of the time an error is either propagated upwards (possibly wrapped in another error) or logged. Inspecting the error variants is usually only done if you want to specially handle some of them (such as handling `ErrorKind::NotFound` when deleting a file), rather than exhaustively handling all variants.
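A small sketch of that pattern, assuming a hypothetical helper named `remove_if_exists`: one variant gets special treatment and everything else is propagated unchanged.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: delete a file, treating "already gone" as success.
// Returns Ok(true) if a file was removed and Ok(false) if none existed.
fn remove_if_exists(path: &Path) -> io::Result<bool> {
    match fs::remove_file(path) {
        Ok(()) => Ok(true),
        // The one error kind this caller cares about.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        // Anything else (permissions, I/O failure, ...) goes upwards as-is.
        Err(e) => Err(e),
    }
}
```

Nothing here tries to enumerate the other `ErrorKind` values, which is the point being made: the match is selective, not exhaustive.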
We can disagree on it being overblown or not but I think it would be enormously useful to be able to look at a function signature and know exactly how it can fail.
They tried that in Java and no one uses it...
Lambdas broke checked exceptions. You can't declare that you throw whatever a generic lambda might throw, so they quickly devolved to "I throw nothing (only unchecked)." The "I throw everything" alternative is rarely used because it spreads virally through every caller.
Checked exceptions were strongly discouraged because they have nonlocal behavior when changing library code. If you want to rethrow exceptions you can’t handle you have to update all callers when the callee changes the throw signature. Lambdas are orthogonal.
I'm thinking of cases like Stream#flatMap where I might be prepared for what a lambda (probably a method ref) could throw, yet still can't use it because exception lists for the Stream interface methods had to be declared statically years ago.
Checked exceptions were vilified long before Java gained lambdas.
While Java usually gets the blame, it was following a precedent set by CLU, Modula-3 and C++.

Also I dearly miss them in .NET, every single time something breaks in production, because some developer didn't bother to catch exceptions, when they should have.

Describing the ways in which a function can fail is typically the job of the documentation.
Why bother having a typed return value? That could be in the documentation too. The whole point of a type system is to help me understand what the function can and cannot do without needing to make guesses based on the documentation. It's not fatal, but it is annoying and inconsistent that Rust can do this on the happy path but not the error path.
The type system cannot capture 100% of the semantics of the function. You put what you can in there, but you also need documentation. You could provide a bespoke error type for every single function that returns an error, but that's a ton of boilerplate, and you're effectively just moving the documentation from the function to the error type (enum variant names are not sufficiently descriptive to avoid having to write documentation).

Even in cases where the error does already precisely match the semantics of the function, you still need documentation. std::sync::Mutex::lock() returns a `Result<MutexGuard<T>, PoisonError<MutexGuard<T>>>`. What's a PoisonError? That's a precise error type for the semantics of this function, but you need the documentation to tell you what it actually means for a mutex to be poisoned.
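To make the poisoning case concrete, here is a minimal sketch. The function name `poisoned_value` and the value 7 are made up for the example. A thread panics while holding the guard, which poisons the mutex, and a later `lock()` hands back a `PoisonError` that still wraps the guard.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Poison a mutex by panicking while its guard is held, then recover the data.
// Returns (was the mutex poisoned?, the value found inside).
fn poisoned_value() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(0));
    let worker = Arc::clone(&shared);

    // The panic unwinds through the live guard, which marks the mutex poisoned.
    let result = thread::spawn(move || {
        let mut guard = worker.lock().unwrap();
        *guard = 7;
        panic!("simulated failure while holding the lock");
    })
    .join();
    assert!(result.is_err());

    let was_poisoned = shared.is_poisoned();
    let value = match shared.lock() {
        Ok(guard) => *guard,
        // The error still owns the guard; `into_inner` hands it back.
        Err(poisoned) => *poisoned.into_inner(),
    };
    (was_poisoned, value)
}
```

The type tells you that `lock()` can fail and that the guard is recoverable, but whether the data behind it is still trustworthy after a panic is something only the docs and your own invariants can answer.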

You cannot get away from having documentation. And you're free to make custom error types for all your functions if you want to, it just doesn't really get you much benefit over having a single unified error type for your module in most cases. If you have a reason to believe the caller actually does want to handle error variants differently then sure, make a new error type with just the variants that apply to this function, there's plenty of precedent for that (e.g. tokio::sync::mpsc::Sender has distinct error types for send() and try_send(), because that's a case where the caller actually may care about the precise reason), but most error values in Rust just end up getting propagated upwards and then eventually logged.

> You could provide a bespoke error type for every single function that returns an error, but that's a ton of boilerplate

If we had more TypeScript-like discriminated union semantics, a lot of the boilerplate would go away. Throw in automatic implementation of From traits for enums composed of other enums / types and it could be pretty close to perfect.

Yeah. The error type requires so much boilerplate that I honestly thought I was stupid and doing something wrong. But nope. Just horrific amounts of boilerplate.

Then people who don’t want to engage with thiserror and anyhow do bullshit hacks like making everything a string that has to be parsed (and don’t provide functions to parse it).

I get why it is that way, but it feels icky.

> I get why it is that way, but it feels icky.

There is no reason it has to be so boilerplate heavy, a lot of it can be fixed but someone has to put in the work. The only technical reason that could hold it back is compile times.

Zig has automatic error unions. No boilerplate at all, but not just a single "error" type. The only downside I see in Zig errors is that they can't hold extra data.
It's a massive downside. When I was using the JSON parser I found it very annoying that it could only tell me the input JSON was invalid, not where in the input the problem was.
Yeah I agree. I think they tend to go for a mutable parameter reference to keep track of that stuff, which is definitely C-like but kinda unwieldy.
Of course there is a reason it is this way. In lower level languages, compilers need to know the type and size of the type at compile time. This holds true even for languages with looser typing like C.

You are not going to get strict types in a low level language and also get ergonomic errors. This is fundamentally not how compilers work.