Hacker News new | ask | show | jobs
by tombert 768 days ago
Interesting.

Algebraic Data Types are almost always one of the things I miss when I use imperative languages. I have to do Java at work, and while I've kind of come around on Java and I don't think it's quite as bad as I have accused it of being, there's been several dozen instances of "man I wish Java had F#'s discriminated unions".

Obviously I'm aware that you can spoof it with a variety of techniques, and often enums are enough for what you need, but most of those techniques lack the flexibility and terseness of proper ADTs; if nothing else those techniques don't have the sexy pattern matching that you get with a functional language.

This C extension looks pretty sweet since it appears to have the pattern matching I want; I'll see if I can use it for my Arduino projects.

4 comments

Everyone who hasn't used ADTs and pattern matching doesn't get what the big deal is all about. Everyone who is used to ADTs and pattern matching doesn't get what the big deal is all about, until they have to work in a language that doesn't have them. And everyone who just found out about them can't shut up about them being the best thing since sliced bread.

:)

I have mainly used them in Rust. They are nice I suppose, but nothing mindblowing.

To me it feels very similar to an interface (trait) implemented by a bunch of classes (structs). I have multiple times wondered which of those two approaches would be better in a given situation, often wanting some aspects of both.

Being able to exhaustively pattern match is nice. But being able to define my classes in different places is also nice. And being able to define methods on the classes is nice. And defining a function that will only accept particular variant is nice.

From my perspective a discriminant vs a vtable pointer is a boring implementation detail the compiler should just figure out for me based on what would be more optimal in a given situation.

> I have multiple times wondered which of those two approaches would be better in a given situation, often wanting some aspects of both.

ADTs are closed to extension with new cases but open to extension with new functions, eg. anytime you want to add new cases, you have to update all functions that depend on the ADT, but you can add as many functions for that ADT as you like with no issues.

Traits are open to extension with new cases but closed to extension with new functions, eg. you can add as many impl as you like with no issues (new cases), but if you want to add a new function to the trait you have to update all impl to support it.

They are logical duals, and the problem of designing systems that are open to extension in both cases and functions is known as the expression problem:

https://en.wikipedia.org/wiki/Expression_problem

I suppose this is a genuine dichotomy, but I feel like it’s missing a more critical difference: ADTs cleanly represent data, even when nothing can be, or needs to be, extended from outside.

For example, a result is a success value or an error. A stock order is a market order or a limit order, and nothing else, at least until someone updates the spec and recompiles the code. Situations like this happen all the time. I don’t want to extend a result to include gizmos in addition to success value or errors, nor do I generally want to extend the set of functions that operate on a certain sort of result. But I very, very frequently want to represent values with a specific, simple schema, and ADTs fit the bill. A bunch of structs/classes, interfaces/traits and getters/setters can do this, but the result would look like the worst stereotypes of enterprise Java code to accomplish what a language with nice ADTs can do with basically no boilerplate.

> For example, a result is a success value or an error. A stock order is a market order or a limit order, and nothing else, at least until someone updates the spec and recompiles the code.

But that's just it, specs are rarely complete because reality is fluid. For example, a result is a success or an error, until maybe you want an errors to prompt the user to correct something and then the computation can be resumed (see resumable exceptions).

Should you even have to recompile your code to handle new cases? Why can't you just add the new case, and define new handlers for the functions that depend on your ADT without recompiling that code? That's the expression problem.

I want more syntax sugar for my ADTs that mirror what traits have. I don't need that kind of double extensibility.
> And defining a function that will only accept particular variant is nice

This is possible to achieve (or hack your way through, if you will) by parameterizing the type and using a nullary type (a type which is impossible to have) to exclude specific cases of a sum type. In Haskell this would look like this:

    data Weather a b c = Sunny a | Rainy b | Snowy c

    -- can't snow in the summer!
    onlySummerWeather :: forall a b. Weather a b Void -> String
    onlySummerWeather weather = case weather of
      Sunny _ -> "Got sunny weather"
      Rainy _ -> "Got rainy weather"
      Snowy v -> absurd v
where `absurd :: forall a. Void -> a` "if you give me something you can't ever have, I will give you anything in return".
By floating the forall., we get another representation type for Void (forall a. a) and `absurd' is one half of that isomorphism :)

    absurd :: Void -> (forall a. a)

    drusba :: (forall a. a) -> Void
    drusba void = void @Void
I would be so tempted to name the variable 'ity' instead of 'v'.
Enums are closed sets and trait objects are open sets. They are conceptually related concepts, but the language puts syntactic distance between the two, and I don't think it should.

There are lots of open design questions for every feature you propose, but all of them have been discussed and have higher or lower chance of making it into the language.

That's kind of greek to me, but shouldn't promising the compiler that my set is closed unlock more features instead of taking features away?
I'm not sure I understand your point, but I'll elaborate on ways that we could "homogenize" the two features:

---

We could add implicit enums to impl Trait, so that you could return different types from a function:

    fn foo() -> enum impl Display {
        if rand() > 0.5 {
            "str"
        } else {
            42
        }
    }
which would let you get around the problem of returning a type erased object for a Trait that isn't object safe:

    trait Trait {
        const C: i32 = 0;
    }
    impl Trait for i32 {}
    impl Trait for &'static str {}
    fn foo() -> Box<dyn Trait> {
        if true {
            Box::new("")
        } else {
            Box::new(42)
        }
    }

    error[E0038]: the trait `Trait` cannot be made into an object
     --> f500.rs:6:17
      |
    6 | fn foo() -> Box<dyn Trait> {
      |                 ^^^^^^^^^ `Trait` cannot be made into an object
      |
    note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
     --> f500.rs:2:11
      |
    1 | trait Trait {
      |       ----- this trait cannot be made into an object...
    2 |     const C: i32 = 0;
      |           ^ ...because it contains this associated `const`
      = help: consider moving `C` to another trait
      = help: the following types implement the trait, consider defining an enum where each variant holds one of these types, implementing `Trait` for this new enum and using it instead:
                &'static str
                i32
---

Relax object safety rules, like making all assoc consts implicitly `where Self: Sized`.

---

We could make enum variants types on their own right, allowing you to write

    fn foo() -> Result<i32, i32>::Ok { Ok(42) }

    let Ok(val) = foo();
There's some work on this, under the umbrella of "patterns in types". For now the only supported part of it is specifying a value range for integers, but will likely grow to support arbitrary patterns.

---

Having a way to express `impl Trait for Enum {}` when every `Enum` variant already implement `Trait` without having to write the whole `impl`.

---

Anonymous enums:

    fn foo() -> Foo | Bar | Baz
---

Being able to match on Box<dyn Any> or anonymous enums

    match foo() {
        x: Foo => ...,
        x: Bar => ...,
        _ => ...,
    }
---

Stop needing to create a new struct type in order to box a single variant

    enum Foo {
        Bar(Box<struct { a: i32, b: i32 }>),
    }
---

These are of the top of my head, there are many things that you can do to make trait objects and enums feel closer than they do today, to make changing the way your code works a "gradient" instead of a "jump". My go-to example for this is: if you have a type where every field is Debug, you can derive it. As soon as you add one field that isn't Debug, you have to implement the whole impl for your type. That's a "jump". If we had default values for structs you could still use the derive by specifying a default value in the definition. That makes the syntactic change "distance" be as far as the conceptual change "distance".

A trait is a collection of variants that may or may not have unknown members. An enum is a collection of variants that may not have unknown implementations. So enums are in some sense a subset of traits. Hence every property of traits is also a property of enums. Does that make sense?

Those suggestion of your look interesting, but I haven't thought them through enough to have an opinion.

> They are conceptually related concepts, but the language puts syntactic distance between the two, and I don't think it should.

Haskell's type classes are a bit like Rust's traits. Type classes are open by default, but optionally can be closed off.

You might be interested in taking a look at OCaml's extensible sum types which may straddle the line in the way you're describing.
> To me it feels very similar to an interface (trait) implemented by a bunch of classes (structs)

Then you might not fully grok sum types yet.

> From my perspective a discriminant vs a vtable pointer is a boring implementation detail the compiler should just figure out for me based on what would be more optimal in a given situation.

Disagree. It's a design choice that should be decided by the programmers. There's a tradeoff—choosing which should be easier: adding a new variant or adding a new function/method. It's called the Expression Problem: https://wiki.c2.com/?ExpressionProblem

I’m in the latter camp (from Ocaml) and now using Go. Go feels clunky and awkward.
That's because Go is intentionally clunky and awkward in the name of "simplicity". IMO it's charming to some degree, but it's far from perfect and I think you'd need some pretty serious threats to get me to describe it as "elegant" in any way.

Rust somehow has more elegance than Go, if only in small parts. Nothing compares to Scheme in the elegance category IMO :)

My view of Go went from being annoyed to "they picked their tradeoffs carefully and consciously and then leaned into them hard to squeeze every ounce of upside out of them" and while it's still not really my preferred aesthetic I now have a lot of respect for the design as a design process.

Meanwhile, getting addicted to scheme many years ago is a lot of why I still mostly write perl, I need 'my' (i.e. expression/block level scoping and you actually have to declare your variables to scope them before use) and closures for the style of programming I like to do.

Rules out python, ruby, and pre-ES6 JS, though JS with 'let' and arrow functions is pretty survivable and will be even more so if the 'do' and 'match' proposals land.

(note: I know go has :=, and while I'm ambivalent about the syntax, the semantics are pretty much what I expect; the above was talking about scripting class languages though hence not mentioning it there)

Despite what I may have made it seem, I actually quite like Go as a language, I just wouldn't call it elegant or pretty (though certainly not as ugly as some other languages out there), but it's very pragmatic.

It's been a long time since I've written any Perl, but I can say I don't care for it as much as I do Lisps in general.

Many people don't care for it, and I made peace with that many moons ago; personally, I'm quite fond of what I'd call 'applications perl', which at least when I write it tends to be a mixture of OO and function composition type code with as much of the state as possible pushed to the outer edge (plus async for I/O if applicable), etc.

It barely feels like the same language as sysadmin scripts are written in, it just happens to share a compiler and a VM.

... and it doesn't exactly even share syntax since perl supports libraries defining their own keywords, e.g. while I won't be surprised if somebody makes it a feature of the VM eventually, our async/await is currently provided by https://metacpan.org/pod/Future::AsyncAwait and as you can see at https://metacpan.org/pod/Future::AsyncAwait#WITH-OTHER-MODUL... there's quite a few other useful ones out there.

Of course, many people complain vociferously about the idea that you have to use *libraries* to get syntax, but as somebody who still loves scheme I regard the ability to build the language up towards the problem to that extent to be a feature.

My first ever cpan release was making continuations pass back out to perl as a subroutine reference from Guile so I could write linear top level logic in scheme and use continuation capturing to sit that atop an event driven I/O stack - axisofeval.blogspot.com's lispx (https://github.com/lispx/lispx/) provides an fexpr based lisp that does similar for JS.

(trying to think of a good example of perl code of mine that shows an example of this is annoying me because I don't have any public code right now that's new enough for me to not have already decided I don't like it anymore ;)

Scheme (and by Scheme, I mean Racket, I guess) is nice, but it's not particularly elegant compared to eg Haskell.
Sealed interfaces in java 21 allow pattern matching
Yeah I know, we just don't use Java 21 at work yet. I'm super excited for that update, and it actually looks like we will be transitioning to that by the end of the year, but I haven't had a chance to play with it just yet.

I do find it a little annoying that it's taken so long for Java to get a feature that, in my opinion, was so clearly useful; it feels like they were about a decade later on this than they should have been, but I'll take whatever victories I can get.

If you can enable preview features, you can use pattern matching since Java 17 (though the final syntax in Java 21 was slightly changed - still you may want to use preview features, it's mostly fine in Java as they tend to change very little, and when you do upgrade, the compiler will tell you where you need to update your code).
Kotlin is JVM compatible and has ADTs.

Java has https://github.com/functionaljava/functionaljava

which is unsupported but stable.

Sure, and Scala has had ADTs since its inception as well I think, and that's also JVM. It's not ADTs, but Clojure does have some level of pattern matching/destructuring as well.

It wasn't that I though that the JVM was incapable of doing something like an ADT, just that vanilla Java didn't support it. While it's easy to say that "companies should just use Kotlin", that's a bit of a big ordeal if you already have a 15 year old codebase that's written in Java.

I've heard of but never used the Functional Java library, though it'd be a tough sell to get my work to let me import a library that hasn't been updated in two years.

> that's a bit of a big ordeal if you already have a 15 year old codebase that's written in Java.

JetBrains has prioritised compatibility with Java and it shows. Of course, there are some gotchas (such as nullability or checked exceptions which don't exist in Kotlin), but you can really mix Kotlin and Java code relatively freely.

Java 21's pattern matching (you don't need functionaljava, and shouldn't really use that unless you're really into FP) is kind of nicer than Kotlin's, because you can automatically "destruct" records in your matches.

For Java, see https://www.baeldung.com/java-lts-21-new-features

Kotlin's: https://www.baeldung.com/kotlin/when

Make up your own mind.

> kind of nicer than Kotlin's, because you can automatically "destruct" records in your matches.

I find positional destructuring of records a bad idea.

https://news.ycombinator.com/item?id=31399737

> Algebraic Data Types are almost always one of the things I miss when I use imperative languages.

Algebraic data types and pattern matching actually work really well in imperative languages, too. See eg Rust.