Hacker News new | ask | show | jobs
by chombier 1514 days ago
I also remember struggling with this paper quite a bit at first. This is how I would explain catamorphisms to past me:

Suppose you have a recursive `tree` data structure, and another one `tree' x` where `tree' x` is just like `tree`, but with recursive occurrences of `tree` replaced with type x. So maybe something like, in Haskell-ish syntax:

    type tree = Leaf int | Node tree tree
    type tree' x = Leaf int | Node x x
Then one can easily write a `fold` over trees with type

    fold: tree -> (tree' x -> x) -> x
where the second parameter (called an algebra) evaluates the result over a tree' assuming recursive subtrees have been replaced with the result of their evaluation (this is the key).

Now the beauty of recursion schemes is that whenever `tree'` is a functor, you can get `tree` (as a fixpoint: `tree = tree' tree`) and `fold` for free (generally called `cata`), which can be seen by rewriting `fold` above using just `map`.

This skips a fair amount of boilerplate code you no longer have to touch when modifying recursive types.

2 comments

Minor nitpick, you don't get `fold` as written for free. You would still need to provide a function `toTreeF :: tree -> tree'` then compose that with `cata`, which you do get for free: `fold t alg = cata alg (toTreeF t)`.
Isn’t toTreeF just ‘out’ as in ‘newtype Fix f = In { out :: f (Fix f) }’ ?
Sorry my last comment was sort of wrong because I neglected to mention Fix (I'm going to claim that I was tired :) ).

They say a picture is worth a thousand words, so hopefully this clears up the difference betwene `out` and `toTreeF`:

    -- | Our basic binary tree type
    data Tree a = Leaf a | Node (Tree a) (Tree a)
    
    -- | A pattern functor for our binary tree
    data TreeF a r = LeafF a | NodeF r r
    
    -- | The fixed point of 'f'.
    newtype Fix f = Fix { unfix :: f (Fix f) }
    
    -- | Given an F-Algebra on 'f' we have a catamorphism on 'Fix f'
    cata :: Functor f => Algebra f a -> Fix f -> a
    cata alg fix = alg $ fmap (cata alg) $ unfix fix
    
    -- | We can sum the elements of a 'Fix (TreeF Int)' using 'cata'
    sumTreeF :: Fix (TreeF Int) -> Int
    sumTreeF = cata $ \case
      LeafF i x -> x + i
      NodeF l r x -> x + l + r 
    
    -- | To operate our binary tree we need to map from 'Tree a' to 'TreeF
    -- a'.
    toTreeF :: Tree a -> Fix (TreeF a)
    toTreeF = \case
      Leaf a -> LeafF a
      Node l r -> NodeF (toTreeF l) (toTreeF r)
    
    sumTree :: Tree Int -> Int
    sumTree = sumTreeF . toTreeF


I didn't compile this code but it should be correct.

    out :: Fix (TreeF a) -> TreeF a (Fix (TreeF a))
    toTreeF -> Tree a -> Fix (TreeF a)
If you're going to introduce a generic functor fixpoint then you'll probably want to define

     type Tree a = Fix (TreeF a)
so that `toTreeF` is just `unfix` (and is effectively free, as I mentioned above)
Yeah you could do that but now you have thrown away the recursive `Tree` type and are always working in `TreeF`. This is totally fine but is a different design decision from where we started.
Thanks for explaining!