Hacker News new | ask | show | jobs
by tel 2405 days ago
While I really like these blog posts for keeping things light and intuitive, they're also sprinkled with claims that go a bit too far.

    > Arrays, Objects, Maps, WeakMaps, and Sets are all algebraic data types
This isn't really true on a couple levels. Algebraic Data Types are described through a small set of composition (data) or decomposition (codata) operations. Arrays, Maps, and Objects might be explainable in this way (though, really, we'd just be modeling them with ADTs). WeakMaps and Sets really cross the line though. WeakMaps, by their nature, need to appeal to something much more than just ADTs and Sets require "quotienting" to unify sets which contain the same elements in various ways. You can build the "raw stuff" from ADTs, but you won't have something that really behaves much like these types at all until you add in something extra.

The emphasis on sum types is huge, but it leaves out something critical: "JavaScript doesn’t have a lot of built-in features for sum types" is true of user-defined sum types (and related to it just not having types all together), but it has great support for a few special sum types: integers, booleans, characters! These sum types are important to talk about because it helps make clear that we already work with sum types all the time! Our basis for distinction and choice is built atop them in any language.

Finally, while products and sums get you some wonderful modeling capabilities, the place where Algebraic Data Types really shine is when you add two further capabilities: function types (exponentials) and recursion (fixpoint, which doesn't really map to high school algebra). Those are clearly built into JavaScript as well and would be great to discuss as well.

4 comments

I really, really hope JavaScript eventually gets support for:

- User defined Sum types - Pattern matching on these Sum types in switch statements - Blocks and things-that-are-currently-statements becoming expressions that evaluate to a value.

If it got those, then it would pretty close to the perfect dynamic language for me.

Considering TypeScript ~success, python type annotations and even PHP looking into sum types it's not far fetched to think js may adopt them.
What is the benefit of pattern matching in a language where any value can be of any type at runtime? You can just pattern match by saying if (obj.type === 'foo') {}; if (obj.type === 'bar') {}; etc in JS, as long as you have set up the convention that you set the 'type' field, and practically this is just as good.

The real benefit of pattern matching is the compiler checking at least that you have valid cases, and ideally that you haven't missed any cases.

Pattern matching goes significantly beyond checking “foo.type”. It adds the ability to specify exactly (as exact as the language and your data allows) what case you’re dealing with. For example, a list of length three where the first and last elements are empty strings. Having to specify this with a series of if statements often leads to verbose and error-prone code, but it’s trivial in a language like reasonml:

    switch (mylist) {
      | [“”, _, “”] => x
      | _ => y
    }
Moreover, pattern matching usually involves pulling information out of the object at the same time. In the above example, maybe you want to do something with the middle element:

    switch (mylist) {
      | [“”, middle, “”] => middle ++ “!”
      | [“foo”, x] => String.reverse(x)
      | _ => “never mind”
    }
All of this would be possible to do with a series of if conditions, but significantly harder to read and implement correctly.

The power of pattern matching grows clearer as more complex data and rules are introduced. For example, it works just as well with nested lists:

    switch (mylist) {
      | [outer, [middle, [inner]]] => outer ++ middle ++ inner ++ “!”
      | _ => “never mind”
    }
While having a type system is helpful in all of the ways a type system is generally helpful, there’s nothing about this code which wouldn’t also be handy in a dynamic language like JavaScript. For further evidence of this see Erlang and Elixir, dynamic languages which make heavy use of pattern matching.

(Code typed on my iPhone so forgive the dopey examples)

Nevertheless, those special cases are the exception, not the rule. Most of the time the pattern matching really is just a more convenient switch statement.
I'm not sure what your experience is, but this seems to differ from mine (lots of recent Haskell experience, some dated OCaml).

I agree that matching on multiple levels in one statement isn't common, if that's what you were talking about. But performing bindings in a match is very common, and not something that's part of a switch statement (except in those languages where a switch statement is doing pattern matching more generally anyway).

Yes, I was just trying to say that matching on multiple levels is rarer. I agree with you that automatically binding new variables is the best part of pattern matching.
It's more than that though.

Scala example:

   myList match {
     case Nil => 0
     case first :: Nil => first * 3
     case first :: second :: Nil => first - second
     case first :: second :: rest => first * second + sum(rest)
   }
Seen ReasonML? Pretty cool. Does that with a JS-y syntax.
In what sense are integers and booleans sum types? What is the definition of sum types that booleans fit? Is it just that it can be true or false, and because there's the "or", it's a sum type?
> In what sense are integers and booleans sum types?

Natural = Zero | Successor(Natural)

Bool = True | False

The mapping from Natural to Integer is standard.

Assuming non-fixed width integers, how can you have an infinite sum type?
There’s nothing wrong with an infinite sum, in principle. You can’t usually define them unless you can use recursion. In the case of integers you can do

    data Int = Positive Nat | Negative Nat
    data Nat = Zero | Succ Nat
Though, that has 2 zeros, a positive and a negative one, but zero should be neither. To avoid that, you can do:

  data Int = Zero | NonZero Sign PositiveInt
  data PositiveInt = One | Succ PositiveInt
  data Sign = Positive | Negative
Good point.
You can't. The Natural number definition combines a sum type with a recursive position. Or in other words, a finite tag (either zero or successors, 2 options) plus a pointer to the next value in the case of successor.
> Is it just that it can be true or false, and because there's the "or", it's a sum type?

You can rote learn that, but the reason why it's intuitive to call them sum types is because you're adding the number of possible values. With product types, you multiply the number of possible values that the type represents. Consider these types:

  -- 1 + 1 = 2 possible values
  data Bool = True | False

  -- 1 + 1 * a = 1 + a possible values
  data Maybe a = Nothing | Just a

  -- 1 * a + 1 * b = a + b possible values
  data Either a b = Left a | Right b

  -- 1 * a * b possible values
  data Pair a b = Pair a b

  -- 1 * a * b * c possible values
  data Triple a b c = Triple a b c
`Maybe Bool` would have 1 + 2 = 3 possible values.

`Either (Pair Bool Bool) (Maybe Bool)` would have (2 * 2) + (1 + 2) = 7 possible values.

Integers would be defined as:

  data Int = ... | -2 | -1 | 0 | 1 | 2 | ...
if they were finite, but since they're not, they'd have to be defined recursively with something like:

  data PositiveInteger = One | Succ PositiveInteger
  data Integer = Zero | NonZero Bool PositiveInteger
where the Bool represents the sign.

Since 2 * x is the same as x + x, we can also define it as:

  data Integer = Zero | NonZero (Either PositiveInteger PositiveInteger)
where Left of Either would represent negative integers, and Right of Either would represent positive integers.
Summing is the act of gluing two components together with “or”. It’s possible to see all types as sums in this fashion (this is essentially “distributivity of addition over multiplication) but it’s also impossible to see char/int/bool as not having some sense of “or” within it.

For a product type with no “or”, consider the JavaScript object {}. We see these as “atomic”.

A boolean is either true or false - i.e., it's the type inhabited only by "True" plus the type inhabited only by "False". An integer is either 1, or 2, or 3, or 4, or..., i.e., it's the countably infinite sum of other types.
First, here is a note about the notation that I will use here: Let `1` denote the unit type, the type with one inhabitant. Let `()` denote the inhabitant of the unit type. Let `a + b` or `Either a b` denote the sum of types `a` and `b`. Let `Left` and `Right` be the constructors of the sum type.

Before defining the integers, one must define the natural numbers. Natural numbers are defined as an inductive type, which is a little different from an ordinary sum type:

    data Nat = Z | Suc Nat
Here is another way of defining them. Start by removing the self-reference:

    data NatF a = Z | Suc a
Notice that NatF is the same thing as the Maybe/Option type, AKA 1+a, AKA the sum of the unit type and a.

Given a "fix" type:

    data Fix f = MkFix (f (Fix f))
the definition of the natural numbers would be:

    type Nat = Fix NatF
An "F-algebra" is a way of defining some algebraic structure or data type as a map from a "sum of products" to the type. More precisely, an F-algebra consists of an "object" (type) A and a map F(A) -> A. For the natural numbers, the F is NatF and the map would be a function mkNat : NatF Nat -> Nat, AKA mkNat : (Maybe Nat) -> Nat or mkNat : (1 + Nat) -> Nat.

Related links:

- https://ncatlab.org/nlab/show/inductive+type

- https://ncatlab.org/nlab/show/natural+numbers+object

- https://ncatlab.org/nlab/show/initial+algebra+of+an+endofunc...

- https://en.wikipedia.org/wiki/F-algebra

---

One way of defining the integers is as the product of the natural numbers with themselves, or Nat * Nat, equipped with an equivalence relation ~ where

(a, b) ~ (c, d) := a + d = b + c

(a, b) can be understood as a - b. Of course, a - b = c - d <-> a + d = b + c.

Another way of defining the integers is as:

    data Int = Neg Nat | Zero | Pos Nat
where 0 = Zero, -1 = Neg Z, and +1 = Pos Z.

---

Booleans can be defined as 1 + 1. Left () = True and Right () = False, or vice versa.

I'm guessing at least a majority of the readers are like me and don't know Haskell, so on behalf of the confused:

  data Nat = Z | Suc Nat
^ What does this crazy construction mean syntactically?

It looks like it is trying to do something like (from your link) Coq's

  Inductive nat : Type :=
   | zero : nat
   | succ : nat -> nat.
But obviously defining natural numbers in terms of Z/the integers doesn't work because I could then call -1 a natural number.
You can't have negative numbers with that definition. Here are numbers and the way you would write them with that definition:

  0 == Z
  1 == Suc Z
  2 == Suc (Suc Z)
  3 == Suc (Suc (Suc Z))
You can define some arithmetic operators for it:

  data Nat = Z | Suc Nat deriving (Show, Eq)

  instance Num Nat where
    (Suc x) + y = Suc (x + y)
    Z + y = y

    (Suc x) - (Suc y) = x - y
    x - Z = x
    Z - (Suc _) = undefined

    Z * y = Z
    (Suc x) * y = (x * y) + y

    negate = undefined

    abs x = x

    signum (Suc _) = Suc Z
    signum Z = Z

    fromInteger 0 = Z
    fromInteger x
      | x < 0 = undefined
      | otherwise = Suc (fromInteger (x - 1))
The definition `fromInteger` is used by GHC to implicitly convert decimal numbers in the code to our Nat. We can use it like this:

  ghci> 0 == Z
  True
  ghci> 1 == Suc Z
  True
  ghci> 2 == Suc (Suc Z)
  True
  ghci> 3 :: Nat
  Suc (Suc (Suc Z))
  ghci> 3 + 2 :: Nat
  Suc (Suc (Suc (Suc (Suc Z))))
  ghci> 3 - 2 :: Nat
  Suc Z
  ghci> 3 * 2 :: Nat
  Suc (Suc (Suc (Suc (Suc (Suc Z)))))
  ghci> 3 - 4 :: Nat
  *** Exception: Prelude.undefined
  CallStack (from HasCallStack):
    error, called at libraries/base/GHC/Err.hs:78:14 in base:GHC.Err
    undefined, called at <interactive>:70:15 in interactive:Ghci8
In my code, the Z doesn't mean integer; it stands for zero. Nat is an inductive type, and Z is the base case. Z has type Nat.
The “or” makes it a sum type.
Doesn't that make everything (other than "bottom" or "unit") a sum type?
Sure, in the same sense that every set is a disjoint union over singletons of all its values. But whether you would or could naturally specify it that way or not depends on the set. You likely wouldn't sit down and specify the type "Function from natural numbers to Boolean" as a big OR over each possible such function, for example.

But yes, whether something is a sum type or a product type or whatever is not generally so interesting as a predicate about the type in itself. It's a relationship between this type and other types. The same way it is not interesting to say "7 is a sum" (everything is a sum); what's interesting is to observe "7 is the sum of 3 and 4".

[Disclaimer: In the same way that some topological spaces are connected and cannot be decomposed into separate disconnected components, one might sometimes reason that some types are connected and not reasonably represented as a disjoint unions over subtypes. So "Everything is a sum" is not always correct in all contexts. But let's ignore this sort of pedantry for now.]

Here's what's bugging me: Integers are a disjoint union of values, which is (in my non-abstract-algebra mind) something different than a disjoint union of types. That is, something like Maybe Integer is a "sum type" in a completely different way from the way that Integer is a "sum type". To me, this is different enough that a different term should be used.

I'm sure you're going to tell me that I'm not thinking abstractly enough...

There are sums at the type and value level. If I define Color = Red | Blue | Green, that’s not a use of a type which sums types together, but instead just is a type composed of a sum of values.

So a “sum type” can be any type which is the sum of different values. On the other hand, Either is a canonical “sum type (constructor)” which sums two types together (via the summation of their vaues).

If it weren't for finiteness restrictions, you could write the following in Haskell and it'd make a natural number type that acted basically as expected.

data Zero = Zero data One = One data Two = Two ... data NaturalNumber = Zero | One | Two | Three | ...

Whatever you would say about sets, you might as well say the same thing about types; the distinction in terminology doesn't amount to much. The natural numbers are the disjoint union of {0}, {1}, {2}, {3}, ... . They're also the disjoint union of the even naturals and the odd naturals, or the disjoint union of {0}, {1}, primes, and composites. There are a million ways in which they are a disjoint union.

I don't think it's your level of abstraction, I think it's that you are treating as unrelated two sides of the same coin. This sum type / variant / disjoint union / coproduct construction operates both at the level of types and the level of values. After all, we only care about types insofar as they tell us about values.
Indeed, maps and sets cannot be constructed from sums, products and recursion. They can be approximated if you throw exponentials in there.

Instead, you can extend algebraic data types to combinatorial species, which do allow exactly the kind of quotienting you speak of. While they're unexplored in programming, they are believed to be programmable similarly to ADTs.

https://repository.upenn.edu/cgi/viewcontent.cgi?article=177...

I must take issue with the "objects are products" characterization you implied. Structures are products; objects need existential types: http://www.cis.upenn.edu/~bcpierce/papers/compobj.ps

Yeah, I hesitated there. But at the same time, many JavaScript objects are just named tuples. That intuition is ok.
You can encode them as finger trees no? Which are algebraic
.oO( Just when you thought you understood algebraic data types, a real functional programming expert arrives to show you that you never will. )
Don’t mind my nitpicking. A huge part of algebraic data types arises from products and sums just as discussed here. As is always the case, there is hairiness between metaphor and precision, but presenting these ideas in JavaScript like this goes a long way!
Please don't do the "I'll never understand this" thing! Sure, it's possible that you never will, but mostly it's just about trying and practicing... IFAIUI the current scientifically supported thinking is that a good way to learn is to approach the subject from many different angles. Something to do with encouraging your brain to form lots of connections from different concepts to the concept you're trying to learn...

I never thought I would understand point set topology, but for a few years I really did. (That was quite a few years ago, and since I haven't practiced... Well, it atrophied. Still remember my favorite proof, though! I also remember what a nightmare proof Stone–Weierstrass was :) ... amazing, but still a nightmare.)

I actually have a decent practical understanding of sum types as they are used in Rust at least (including the tagged union implementation of enum). It's good enough to do useful work.

Every time that functional programming patterns get discussed, though, there is a mad rush by functional programmers to impress upon everyone just how complicated they are.

It is the same impulse that keeps the mathematical entries in Wikipedia accurate but utterly worthless for most people.

Absolutely, a practical understanding is enough. It also suffices for monads, monad transformers, etc.

> Every time that functional programming patterns get discussed, though, there is a mad rush by functional programmers to impress upon everyone just how complicated they are.

I... haven't observed this behavior personally. FWIW, I don't think that saying (for example)

> Hey, we can actually define what "|" and "(_,_)" means in a mathematical way

detracts in any way from a practical understanding. Can you elaborate on why you think it's detrimental? (I understand that it can come off as a little bit smug, but beyond that... meh?)

Asserting that an article is "sprinkled with claims that go a bit too far" calls into question what the reader may have learned. It does not build upon the article, like a popularizer would, but first tears it down.

Then, to restore the reader's sense of mastery over the material, it's proposed that they grok a significantly more esoteric perspective.

I'm just trying to understand... is your beef here with the particular choice of verbiage?

(FWIW, I think it's absolutely right to call into question what the reader may have learned if an article is misleading. We can argue about wording, etc., but accuracy is important, I feel. Now, Lies-to-Children[0] certainly have their place, but they must be fundamentally accurate even if simplified.)

[0] I think I first heard of this concept in the Science of Discworld books...?