Hacker News new | ask | show | jobs
by duped 1001 days ago
> But this is just a generic sum type?

No, it's actually less generic. It's not determined by position but by type. For example `A | B | A` is the same type as `A | B`.

This is useful as a shorthand when you don't want/need a new type to represent your problem, similar to tuples.

1 comments

So you're talking about untagged unions?

> This is useful as a shorthand when you don't want/need a new type to represent your problem, similar to tuples.

Yes this is handled perfectly by the generic sum type, you don't need untagged unions for this. Rust used to have Either in its standard library, but they removed it and kept Result only. Semantically they're the same (a ⊕ b) but Result's name implies it has something to do with some "results". Anyways nothing stops you from creating one yourself, or even using Result if you're fine with the weird-sounding name.

No, I am not talking about untagged unions. I don't understand your notation. To be concrete, I am talking about tagged, disjoint union type, that does not require naming a new type to use.

This is also not covered by the Either/Result type.

Rust supports untagged unions, but they cannot be matched (because they have no tag). An anonymous union would still be tagged internally, but would be less general purpose than the generic enum type.

> To be concrete, I am talking about tagged, disjoint union type

But you just said "For example `A | B | A` is the same type as `A | B`". How would this be possible for tagged union types?

> that does not require naming a new type to use

> This is also not covered by the Either/Result type

It's more probable that I'm just not understanding what you're talking about, but *the only* re-usable tagged union type similar to tuples is *the* sum type.

Let's say you're dealing coffee. People want it either with sugar or without sugar. You don't want to create a new sum type CoffeeFlavor? Fine, just use Either<Sugar, NoSugar>. This is *the* equivalent of a tuple. You need more than 2 options? No problem, Either<Sugar, Either<JustABit, NoSugar>>. I don't know what else could be a "anonymous tagged union".

Ah ok I think we're mixing up terms here - in the context of systems programming languages, a "tagged" union refers to an integer in front of a bag of bytes that holds the data of the "un tagged" union. Rust has both tagged (enums) and untagged (unions) union types.

What you're asking about is a discriminated vs non-discriminated union, and indeed, that's exactly what I'm talking about.

A | B |C is not the same type as Either<A, Either<B, C>> because Either<A, Either<A, B>> cannot type check as Either<A, B>.

But even if you want to argue that you can represent things that way, it misses the point. The goal is to remove complexity from the type hierarchy of the program, not add to it.

> because Either<A, Either<A, B>> cannot type check as Either<A, B>

Why would you want the former to type check as the latter? Where do you see the complexity?

Because your code might not need to care about the position you insert your A or B (left/right for Either), you also might not care whether it's an (encoding as) Either<A,B> or SomeoneElsesEither<A,B> and you also don't want to have to deal with flattening nested Either's as in the example.

These types are also called "set-theoretic" types as A|B means exactly the set of all values that can be typed as A or typed as B - note that this also induces a whole subtyping rule by set inclusion and this is in contrast to sum types where a value typed Either<A,B> can never be typed A or B - to move between them you need to apply extractors/match/constructors/(not sure of standard type theory nomenclature).

Implementation of these union types might still need additional tags and construction/matching/extraction underneath, but from a programming perspective there's less complexity as compared to involving an additional named type Either (or EitherOf3 and EitherOf4 and ...) and manually implementing set-theoretic laws.

Consider this code:

    fn foo () -> A | B | C {
        if condition {
            bar();
        } else {
            baz();
        }
    }

    fn bar() -> A | B {
        ...
    }

    fn baz() -> B | C {
        ...
    }

vs

    fn foo () -> Either<A, Either<B, C> {
        if condition {
            match bar() {
                Either::Left(a) => Either::Left(a),
                Either::Right(b) => Either::Right(Either::Left(b)),
            }
        } else {
            match baz() {
                Either::Left(b) => Either::Right(Either::Left(b),
                Either::Right(c) => Either::Right(Either::Right(b)),
            }
        }
    }

    fn bar() -> Either<A, B> {
        ...
    }

    fn baz() -> Either<B, C> {
        ...
    }
The latter code composes poorly and requires an extra branch at runtime. It is fundamentally more complex to dispatch on nested discriminated unions instead of flat non-discriminated unions both for the programmer to write, read, and for the runtime to execute.

The compiler can also optimize the representation of the anonymous enum based on the context in which its created, whereas its more difficult to do that in the discriminated case.

This isn't a controversial opinion, there are mountains of Typescript written in this style.