| HN Mirror


by malcolmstill 1203 days ago

So usually you don't have to specify the error type. The Zig compiler works out which errors a function can return by looking at the errors it returns directly and the errors returned by any functions called within its body. That set of errors forms an enum (an error set) that is the actual error type; you just don't have to write it out explicitly. An example might be:

    fn someFunction(a: usize) !usize {
        if (a < 10) return error.LessThanTen;
    
        const b = try anotherFunction(a);
    
        return 2 * b;
    }
    
    fn anotherFunction(x: usize) !usize {
        if (x < 20) return error.LessThanTwenty;
    
        return x * 3;
    }

The compiler infers the error type of `someFunction` as:

    error {
        LessThanTen,
        LessThanTwenty
    }

When you then `switch` on an error type, the compiler will exhaustively check that you have handled all the cases.

Note the "(mostly) equivalently" was a reference to the fact that Zig errors can't (currently) contain any other information, whereas an error in rust can carry other information.
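For comparison, here is a minimal Rust sketch of the same two functions with errors that do carry data. The `MyError` enum, its `got` fields, and the messages are assumptions made up for illustration; they are not from any real library.

```rust
use std::fmt;

// Hypothetical error type: each variant carries the value that triggered it.
#[derive(Debug, PartialEq)]
pub enum MyError {
    LessThanTen { got: usize },
    LessThanTwenty { got: usize },
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::LessThanTen { got } => write!(f, "{} is less than ten", got),
            MyError::LessThanTwenty { got } => write!(f, "{} is less than twenty", got),
        }
    }
}

pub fn another_function(x: usize) -> Result<usize, MyError> {
    if x < 20 {
        return Err(MyError::LessThanTwenty { got: x });
    }
    Ok(x * 3)
}

pub fn some_function(a: usize) -> Result<usize, MyError> {
    if a < 10 {
        return Err(MyError::LessThanTen { got: a });
    }
    // `?` plays the role of Zig's `try`: it returns early on an error.
    let b = another_function(a)?;
    Ok(2 * b)
}
```

Unlike the Zig version, the error type here has to be named in both signatures, but the caller gets the offending value along with the error.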

Also note that the compiler can't infer the error type in all cases, for example for a recursive function. In that case you do need to write out the error set explicitly.


dureuill 1203 days ago

Ah interesting approach, thx for the answer.

1. Doesn't that risk introducing accidental breaking changes by adding a new error to the set in the implementation, since the set of errors is inferred from the implementation? Having a compile error in this case in Rust is often the last barrier standing between me and an accidental major semver bump (since callers have to exhaustively match on the error conditions).

2. Can you have data in the variants of the error enumeration?


malcolmstill 1203 days ago

> Doesn't that risk introducing accidental breaking changes by adding a new error to the set in the implementation, since the set of errors is inferred from the implementation?

Do you have an illustrative example? (I'm not implying it's not possible, just trying to think of a good example so I can give a good answer)

> since callers have to exhaustively match on the error conditions

If it's exhaustiveness that you're referring to, zig will make you handle all the possible errors (you can still do a catch all type thing when handling errors, which has the potential to "hide" an error that you otherwise wanted to handle explicitly).

> Can you have data in the variants of the error enumeration

No, they compile to just integers. Essentially it compiles to the same thing as a C function that returns an `int` representing the error (with your actual return type passed in as a pointer, say).

Another limitation of zig error values is I think they're globally scoped, so you potentially could have two libraries have clashing error names that you then can't differentiate (I don't know if there are any plans to try and resolve that).

I will say that this automatic error set inference gives writing zig code this lovely "flow", where I do some error checks at the top of the function and early return some errors (which I just invent the names of there and then) and then move onto the happy path of the function, happy in the knowledge that the error handling is already "correct" (in that if the error isn't handled it'll (typically) bubble all the way up to main and exit the program). Any refinement on how a specific error is handled, I can go back to an appropriate place in the call stack and handle it. I always feel like it's helping me write correct code.


dureuill 1203 days ago

> Do you have an illustrative example?

Illustrative, I don't know, but I'll try to give more context.

When writing a library, it is important that public items (like functions and enums) don't change between minor versions, so that client code doesn't need to update its calls to the library.

Sometimes when refactoring code you end up modifying how a library function is implemented. Maybe it will now depend on some file being present on the system, while previously it wouldn't, meaning that the absence of that file adds a new error variant to this function.

In today's Rust, since the Error type of a Result is an explicit part of a function's signature, such a change is very noisy to the library's maintainer: it entails either modifying the signature of the public function to return a different error type, or modifying the Error type itself, which is also public.

When this happens, the change needs to be reconsidered: you can defer it to later, provide an additional function with the new implementation and error variant, try to make it work with the error types you already have, or decide in good conscience that it actually warrants a major version bump.
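A minimal sketch of why the Rust change is noisy. `LoadError`, `load`, and `describe` are hypothetical names invented for illustration, and `describe` stands in for client code that matches exhaustively on the library's error type.

```rust
use std::num::ParseIntError;

/// Hypothetical public error type of a library.
#[derive(Debug)]
pub enum LoadError {
    Parse(ParseIntError),
    // Adding a variant here (say `MissingFile(std::path::PathBuf)`) edits a
    // public type, so the maintainer cannot make the change without seeing it.
}

/// Hypothetical public library function; the error type is part of the signature.
pub fn load(input: &str) -> Result<u32, LoadError> {
    input.trim().parse::<u32>().map_err(LoadError::Parse)
}

/// Stand-in for client code that handles every variant explicitly.
/// With a new variant in `LoadError`, this match stops compiling
/// (non-exhaustive patterns) until a new arm is added.
pub fn describe(err: &LoadError) -> &'static str {
    match err {
        LoadError::Parse(_) => "input was not a number",
    }
}
```

The compile failure in `describe` is the "last barrier" mentioned earlier: the new error condition cannot reach callers without both sides noticing.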

By contrast, if the set of errors of a function is inferred rather than part of its explicit signature, then when modifying the implementation you can add a new variant without even realising it (for instance, by mixing up the variant name with one returned by a sibling function that you thought was already used by this function) and break semver in a much more silent way.

I guess it also makes life harder for tooling, since it has to parse the implementation of a function (and all its subfunctions) to rebuild the set of errors, as opposed to simply parsing the signature of the top-level function.

> Essentially it compiles to the same as C function that returns an `int` representing the error

That feels very limiting, I often use error types to e.g., attach data about the error. Is there a more general mechanism for sum types for when this shorthand doesn't apply?


malcolmstill 1203 days ago

I see what you mean. Yeah, I suppose if you are writing a library you might want to be more deliberate about the error set. Maybe explicitly writing out the error set is what you want in that situation. Rewriting the example:

    fn someFunction(a: usize) MyError!usize {
        if (a < 10) return error.LessThanTen;
    
        const b = try anotherFunction(a);
    
        return 2 * b;
    }
    
    fn anotherFunction(x: usize) MyError!usize {
        if (x < 20) return error.LessThanTwenty;
    
        return x * 3;
    }

    const MyError = error {
        LessThanTen,
        LessThanTwenty,
    };

> That feels very limiting, I often use error types to e.g., attach data about the error. Is there a more general mechanism for sum types for when this shorthand doesn't apply?

I agree it's limiting. It obviously is going to depend on your application, but I have largely done without annotating errors with extra information (that maybe speaks more to the seriousness of my zig projects than to that approach to error handling as being sufficient!).

One pattern I have used (in e.g. a parser) is additionally passing in a pointer to a sum type:

    fn parse(allocator: *mem.Allocator, tokens: Tokens, parse_error: *ParseError) !AST

The `parse_error` can be set if an error condition occurs. I concede that that's a little clumsy.


tialaramex 1203 days ago

> In today's Rust, since the Error type of a Result is an explicit part of a function's signature, such a change is very noisy to the library's maintainer: it entails either modifying the signature of the public function to return a different error type, or modifying the Error type itself, which is also public.

For what it's worth, if you expect this might happen, you should give the enum the `#[non_exhaustive]` attribute. This attribute says: I, the implementer, know how many of these there are, and in my code I can exhaustively enumerate them, e.g. in a pattern match. However, you, the third-party programmer using this crate, must assume you can't know how many there are, and must therefore write a default match arm to handle others, even if there seem to be no others when you write it.

Once you do this, you don't cause a compatibility break by adding a new item.
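A minimal sketch of what that looks like. `FetchError` and `explain` are hypothetical names for illustration. The attribute is only enforced across crate boundaries, so in a single file like this the wildcard arm is not required by the compiler; `explain` simply shows the shape that client code in another crate would be forced to take.

```rust
/// Hypothetical public error enum that may gain variants in a minor release.
#[non_exhaustive]
#[derive(Debug)]
pub enum FetchError {
    Parse,
    MissingFile,
}

/// Shape of a match as a downstream crate would have to write it:
/// handle the variants you care about, then a wildcard for everything else.
pub fn explain(err: &FetchError) -> &'static str {
    match err {
        FetchError::Parse => "input was not a number",
        // In another crate this arm is mandatory, even if every variant
        // known today were already listed above.
        _ => "some other error",
    }
}
```

Because client code already has the wildcard arm, a variant added later lands there instead of breaking the build.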
