Hacker News new | ask | show | jobs
by arialdomartini 687 days ago
A series of articles starting from the very scratch and getting to the State Monad. It's thought to be novice-friendly: although using F#, it assumes no knowledge of it. If only it aroused someone's curiosity around F#, that would make my day.

It shows how algorithms with mutable state can be implemented with pure functions, with immutable variables only.

The series itself is an excuse to take several detours on other Functional Programming topics: currying, partial application, recursion, Functors, Applicative Functors.

The source code includes F# and C# examples. In the next weeks it will be followed by code examples showing how to apply a state monad in practice.

3 comments

Not all the ways through it, but a good intro.

On first glance, your example:

    (decimal | DivideByZeroException) Divide(decimal n, decimal d) => n / d;
Isn't that far off from what is pretty easily done in C#:

    using System.Runtime.CompilerServices;

    (decimal? , DivideByZeroException?) Divide(decimal n, decimal d) {
      try {
        return (n / d, null);
      } catch {
        return (null, new DivideByZeroException());
      }
    }

    Divide(6, 2).Switch(
      (result) => Console.WriteLine($"6 /2 is {result}"),
      (ex) => Console.WriteLine("This is an exception!")
    );

    Divide(6, 0).Switch(
      (result) => Console.WriteLine($"6 /2 is {result}"),
      (ex) => Console.WriteLine("This is an exception!")
    );

    public class DivideByZeroException : Exception;

    public static class TupleExtension {
      public static void Switch<T, E>(this ValueTuple<T, E> tuple, Action<T> action1, Action<E> action2) {
        if (tuple.Item1 is null && tuple.Item2 is null) {

        } else if (tuple.Item1 is null) {
          action2(tuple.Item2);
        } else {
          action1(tuple.Item1);
        }
      }
    }
(Sharing for folks who aren't familiar with C# Tuple types and extension methods)
You'll be keeping a lot more info in your head with your approach. For instance, the following illegal states have legal representations in your system:

    (null, null)
    (result, new DivideByZeroException())
Hence your unhandled base case in the extension method. Sum types are a more accurate representation and more pleasant to use as a result.

(btw, you're looking through the Monads For The Rest Of Us tutorial, but the the State Monad For The Rest Of Us tutorial is the one linked to this thread)

I don't see those as necessarily illegal states, but even so, it's not hard to handle by adding a third `Action` for it (I kept the example short and left the first `if` condition empty).

The second case, if there is an exception, it would be handled as such by simply switching the order of the `if` in the extension method. None of this seems confusing nor onerous in day-to-day use.

What does it mean to receive both value and error in the context of division? What about neither?

The issue is not that they cannot be handled, it's that they can be. You included an entire extra branch and check for something that will never occur because they're legal values for your tuple type. I assume it'll be optimized out of this trivial example, but imagine writing a function to receive your tuple type in a codebase where you didn't create the tuple. ADTs prevent this entire worry because there is no way to represent illegal states.

    What does it mean to receive both value and error in the context of division? What about neither?
It's only "invalid" if you define it as such. We can state "if we receive neither, then this is the same as an error state". And "if we receive both, than an exception occurred and should be handled".

I don't see the issue?

The return type of a function is an assertion. You’re saying that values of the type exist. The return value is an assertion about the arguments. You’re saying that they imply the result. The issue is that your code is incorrect, in a logic sense, if it states that for some given value the mathematical operation of division results in both an error and a number.

The issue is not one of usability or (direct) understandability, it’s an issue of correctness. Something like “well it works fine for me if I just remember to check if I receive both” or “everyone on my team understands that receiving both means an error occurred” just isn’t a valid response, it doesn’t address the problem.

Now you have 3 different states that all represent an error:

    (null, new DivideByZeroException())
    (null, null)
    (result, new DivideByZeroException())
One of which does not actually use our custom error type, and one of which contains a value for some reason. Our function is simple division! Are there really 3 different ways division can fail?

This isn't a showstopper, it's just more stuff you have to keep in your head.

> I don't see the issue?

> if we receive both, than an exception occurred and should be handled

P(decimal ∩ DivideByZeroException) = 0 , therefore we only need 2 states, not 4.

You'd love Go because they also treat Sums as Ad-Hoc Product types.
Very cool, thanks for sharing. I wonder how easy it would be to port this to Rust, just for the heck of it...

One minor thought: where you have `A Tree is either a Leaf or a Node containing 2 branches each containing another Tree structure.`, perhaps you could have `A Tree is either a Leaf or a Node _made of_ 2 branches each containing another Tree structure.`

So that "made of" easily maps to the "of" in F#'s type definition. Just an idea.

Given that the book opens by defining a recursive datatype, it would be an exercise in frustration to port this to Rust...
How so? Can't you just put your Tree in a Box?

    enum Tree {
        Leaf,
        Node(Box<Tree>, Box<Tree>),
    }

    fn number_of_leaves(tree: &Tree) -> u32 {
        match tree {
            Tree::Leaf => 1,
            Tree::Node(left, right) => number_of_leaves(left) + number_of_leaves(right),
        }
    }

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn counts_the_number_of_leaves_in_a_tree() {
            let tree_with_3_leaves = Tree::Node(
                Box::new(Tree::Leaf),
                Box::new(Tree::Node(
                    Box::new(Tree::Leaf),
                    Box::new(Tree::Leaf),
                )),
            );

            let leaves = number_of_leaves(&tree_with_3_leaves);

            assert_eq!(leaves, 3);
        }
    }
Yes, and the frustration is on dealing with all the boxes and unboxing, at the exact correct place, and that won't unify with each other.

There's a sibling comment about porting it to C#. The Rust version will be less frustrating than the C#, but it is the same kind of bad.

On Rust specifically, this happens because Rust is a low level language.

The sibling comment looks way closer to port of Go which is rather unidiomatic C#. I understand where it is coming from, but there is a much terser and type-safe way to express it:

    #pragma warning disable CS8509 // The switch expression does not handle all possible values.
    // Justification: The switch expression is exhaustive, and Roslyn emits optimal default guard.
    // In AOT, ILC should be able to statically prove by whole program view that only two subclasses of Tree exist and optimize it away.
    // In practice, it seems null check and recursive nature of the call prevent it for now. It is still compact and fast codegen however.

    var treeWith3Leaves = new Tree.Node(
        new Tree.Leaf(),
        new Tree.Node(
            new Tree.Leaf(),
            new Tree.Leaf()));

    var leaves = NumberOfLeaves(treeWith3Leaves);
    if (leaves != 3) {
        throw new Exception("Huh, something went wrong.");
    }

    static int NumberOfLeaves(Tree tree) {
        return tree switch {
            Tree.Leaf   => 1,
            Tree.Node n => NumberOfLeaves(n.Left) + NumberOfLeaves(n.Right)
        };
    }

    abstract record Tree {
        public record Leaf: Tree;
        public record Node(Tree Left, Tree Right): Tree;
    }
Once we have type unions[0], the syntax could be easily promoted (assuming current proposal state) to e.g.

    union Tree {
        Leaf;
        Node(Tree Left, Tree Right);
    }
Or a bit lower-level variant:

    record Node(Tree Left, Tree Right);

    union struct Tree {
        Leaf;
        Node;
    }
This will allow to remove exhaustiveness suppression and not pay for (house of) leaves.

[0]: https://github.com/dotnet/csharplang/blob/18a527bcc1f0bdaf54...

Geez, that's way easier than I remembered. When I originally encountered this while learning Rust I figured recursion was my problem and switched to learning Haskell. (No regrets.) I guess I need to take my findings back to Rust!
Cool idea! Thank you for the suggestion. Fixed!
Happy to contribute even if in a minuscule way :-) FYI you changed the definition when it first appears, but not in the shaded box when you repeat it later in Chapter 1
For someone who's curious about these topics, do you recommend F# as a language to learn and apply them in practice? I've always thought of Haskell as THE functional programming language because of its purity and elegant syntax which looks very much like math. But lately I've seen people talk about F#, Clojure, ... and I wonder if Haskell is mostly an academic language whereas these are mostly practical?
Those two languages are popular because they bring some of the benefit of languages like Haskell into an environment that is compatible with an outdated but widely used OOP language.

Haskell will bring you more of the benefits from FP with expressive types; Idris will bring you even more. But you will lose the integration.

A lot of people mistake Purity as a language feature for meaning a more pure functional language. What makes a language functional is that their basic unit of abstraction is functions, not whether they isolate IO effects by default.
Impure functions lack referential transparency, which is often sacrificed but kind of important to being a function - getting the same result every time you call it with the same input. Alternatively, you can say that every function has an implicit input parameter of the the current program state, but that turns every function on one bit of input to a potentially astronomically complex function of gigabytes of input.
I do agree with what you're saying about the value of referential transparency and the formal definition of a function as a lambda term. In fact, I work professionally as a Haskell developer and highly value this aspect of the language design; equational reasoning is a really powerful tool.

My point is that functional programming as engineering practice and language family is broader then variants of pure lambda calculi and Haskell occupies one particular place in that broad space.

Haskell focuses on both, but prioritizes research. I have had several Haskell jobs and highly recommend it for the personal enjoyment.
Haskell is far from being an academic language. It’s used in industry and has very practical features and libraries.

Same as F#, OCaml, etc.

F# is not particularly practical as far as jobs go, but it benefits from the large C# ecosystem.

I'm primarily a C# dev, but after dabbling in some F#, I concluded that most of the niceties in F# could also be done (to some extent and with some limitations) on C# as they have the same compile targets.