Hacker News new | ask | show | jobs
Effectful Haskell: IO, Monads, Functors (slpopejoy.github.io)
84 points by brassybadger 3943 days ago
7 comments

Naming is one area where Haskell's academic background is more curse than blessing. Numerous tutorials and blog posts would never have been written if only functor was named mappable and monad was named computation builder. They're named by analogy to category theory, but their usage in computer science is not at all similar.
The Haskell monad typeclass _is_ (modulo technical details) a category-theoretic monad on the category `Hask`.

As Edward Kmett says in more detail in [1] - the point of calling a monad a monad is not to confuse the reader, but to unlock 70 years of documentation on the concept for the reader. How many programming concepts/tools/libraries do you use that have 70 years of documentation?

Being abstract is profoundly different from being vague.

[1]: https://yow.eventer.com/yow-2014-1222/stop-treading-water-le... (around the 20 minute mark).

To be fair, the great majority of that 70 years of documentation is generally going to be completely incomprehensible to most programmers not named Edward Kmett. And, of course, there's a lot of debate about how much practical use CT has in day-to-day programming. Not that I don't get your point.
> 70 years of documentation

This is probably more off-putting than you suspect

Ok, so there's 70 years of documentation for me to research monads but I don't care unless I want to do advanced work in Haskell.
How is the list monad a "computation builder"? Sounds to me like "programmable semicolons" and other nonsense: Monad is a typeclass that means very different things depending on what type you're using. It's the shape that has meaning: this level of abstraction is just going to be hard to grasp correctly no matter what you call it.

Likewise with "Mappable": in most other languages, `map` is only an operation on lists or list-like things. `ask` in `Reader` is implementable with Functor only: by what tortured metaphor does `Mappable` help you understand that? Also it sounds like Java. Yuck :)

Names are a bikeshed. Monad, Applicative and Functor have the advantage of at least being rigorous, I can't see any other name being better.

As a novice programmer used to node.js, C, etc., how the hell is one supposed to get started with Haskell? I've tried multiple tutorials, read many articles like this, and still can't follow along.

It makes me wonder who ends up using this language for anything serious in even the slightest time crunch...

The big not-so-secret secret is that you don't have to understand much of Monad theory to enjoy Haskell. If you want to use Haskell as a nice functional language with a great typesystem.

As spopejoy says, Monad is a typeclass, that means every Monad is a type on itself. So you you can get by just learning every monad for its use without ever needing to know the theory that binds them.

The Maybe Monad is just an option type, the list monad is just a list and the IO monad is just a sequence of operations that gets returned to the runtime from the main function. Who cares that they all share a typeclass?

I've heard good things about the opening sections of the WikiBook - and if it is hard to understand, ask questions and then improve it! :)

https://en.wikibooks.org/wiki/Haskell

Try elm (http://elm-lang.org/) first; it doesn't mention Functor, Applicative, Monoid, Monad anywhere.

You might come back to Haskell later and it won't seem like such a big deal. (I use Haskell for serious work and love that a quick patch doesn't suddenly start breaking things in 10 other places - it's a real time-saver!)

Try these: learnyouahaskell.com haskellbook.com
Their usage in Haskell/OCaml etc is precisely faithful to their category theoretic definitions as can be in a general purpose language.

This debate about naming monads is pretty tiresome after so many years, if one called it "computation builder" it wouldn't change their structure or convey any notion of the laws any better than term monad. A monad at it's core is a set of algebraic relations.

But most people take comfort in having almost meaningless names. I can't blame them, when I first saw FP, I was lost in a see of nonsense. After time you see that lots of things and names are just crutches, and that structures, shapes, patterns, recursion relationships are where to look for answers.
Can you elaborate a bit on the last sentence? What exactly are the relations and what is the set?
If you want a description of the monad laws in Haskell terms you can Google and find like 80 expositions on the topic of various depths.

If you want a mathematical exposition. "Category Theory" by Awodey page 265 is a concise description.

> Numerous tutorials and blog posts would never have been written if only functor was named mappable and monad was named computation builder.

If that were so, monad and functor tutorials would be as follows:

Functor tutorial: imagine it's called "mappable". End of tutorial.

Monad tutoral: imagine it's called "computation builder". End of tutorial

I disagree.

There are some major benefits to the Haskell community's approach of naming abstractions after the math (where a suitable mathematical abstraction exists).

First, as has been mentioned, there are existing treatments of the objects in question, some interesting results there can be ported over to programming, and intuitions there lead new and sometimes useful places. Some newcomers will even be familiar with the concepts already - this is very few people for monad, but far more for monoid and semigroup. It avoids erecting an unnecessary wall between programming knowledge and mathematical knowledge.

Second, it changes the character of a particular kind of discussion: there is never ambiguity about whether a newly considered operation on a type "really" is "appending" - is it associative? does it have an identity? you've got a monoid. This means it's very clear what you can and cannot assume around a particular interface. Questions about whether "functors" are well thought of as "containers" or "mappable" or whatnot are clearly only questions of pedagogy.

The monads that were harder for me to understand while learning Haskell had quite plain names like "State" and "Continuation". I don't think alternative naming would have helped.

By the way, while I'm not sure the monad concept would be useful for all languages, I believe monoid really should become common parlance. It's such a simple and useful idea.

A decent indication that it's not as simple as "well just call it x instead" is that each time I see a comment like this there is often a new "easy" word for monad. Yours is new to me, so I'll add it to the list:

Monads are hard, let's call them

- FlatMappable

- AndThenable

- Joinable

- Chainable (or, Daisychain)

- Computation Builder

Their naming is similar enough to leverage existing research literature, and knowledge, though. Those with a background in math can map some of their existing knowledge. Those without a math background have a more precise vocabulary to understand the concepts and research further. Even if they are a bit foreign, I prefer the attempt to be precise.
It's worth observing that "background in math" here probably implies postgraduate study in a relevant field, which very few people will actually have done. The average new math graduate with a bachelor's degree might have heard of category theory but probably hasn't studied it during an undergraduate course.

More generally, I think one problem the Haskell world has with attracting more developers is that it's dominated by people with a very "pure maths" mindset. The people developing and advocating Haskell often enjoy exploring the abstraction possibilities and interactions for their own sake, just as a research pure mathematician studying some form of advanced algebra might. Of course, there's nothing wrong with that, and as a platform for programming language research it's probably an asset to have a lot of such people involved. However, most other people, even those of a technical persuasion, do not find such a purely theoretical approach interesting. They want motivation for any theory they are learning and they want practical applications to show why it's relevant.

But most of the FP culture comes from abstract mathematics. Formal proofs (ml), Symbolic rewriting systems (lisp deriv), ... The issue is that people see programming as the semi concrete modification of devices (we need tangible at first), when these people saw programming as recursive logic over anything, that can be mapped later on actual (or virtual) hardware.
While it takes exposure to Category Theory to have knowledge of Monads in particular, a course in undergraduate level Abstract Algebra gives a good enough foothold to make the mathematical definitions of "monad" tractable.
A crude analogy would be mathematicians exploring math vs physicists using math.
My problem wasn't the word but the fact that the word is not adjective - originally I spent a lot of time thinking "where is the monad?". This of course is the wrong question to ask, but if the word was "monadic" instead, then I would've known that this is a property (like iterable, foldable, applicative etc), and realized I should think of typeclasses more in terms of interfaces rather than values.

Sure, their instances (implementations) are values (dictionaries of functions which are implicitly passed around) but thats beside the point when learning.

That's something you have to learn to do. It's like asking "where is the stack?" when you're just calling functions. If you don't develop some skill for seeing how your code implements underlying abstractions, you will be missing out. You can program without knowing what a stack is. But if you want to solve a problem using a stack, it helps to know that your function call stack can be 'overloaded' to solve stack problems, rather than building an explicit stack data-structure yourself.

It's harder than it has to be, but it's also more powerful than it has to be. Because against the right kind of problem, it'll be just barely powerful enough.

What I meant was something different. The difference is that saying that "IO is a monad" is wrong in the mathematical sense too. IO is not a monad - the combination of (IO, bind, return) is the monad. It only makes sense to say the above in Haskell where there can be only one instance per type.

The set of natural numbers N is not a monoid; {N, +, 0} is a monoid. The set may, at best, be "monoidal" (i.e. there exists associative binary operator <> and a set member ZERO such that for all elements of the set, ZERO <> x == x <> ZERO == x).

So as a beginner, it doesn't even help to try and read the "70 years of literature" on the subject, because what you read there does not match (I'm reading about a set, an operation and an element of the set - and all I have here is the actual set. Where are the operation and the element? Oh they are defined and passed implicitly for the type. Oh so the type isn't the monoid, its at best MONOIDAL)

"[T]he combination of (IO, bind, return) is the monad. It only makes sense to say the above in Haskell where there can be only one instance per type."

This is very important, and something that took me a bit to grok, and definitely got in the way of understanding for a while.

One thing you can do - and I don't know whether it makes sense to teach it this way - is think of the typeclass constraints as actual arguments (absent optimizations, that is literally the case anyway - GHC passes instance dictionaries). In that case, "the monoid" is that dictionary, and the type referenced is the carrier set, and a value of that type is an element of the carrier set.

> What I meant was something different. The difference is that saying that "IO is a monad" is wrong in the mathematical sense too.

It's relatively common to refer to the underlying set as a monoid (or whatever structure you're talking about) if it's clear from the context what the operations are, though.

> The set may, at best, be "monoidal" (i.e. there exists associative binary operator <> and a set member ZERO such that for all elements of the set, ZERO <> x == x <> ZERO == x).

That's a pretty useless definition, though, as every non-empty set trivially satisfies that condition (just pick any element as the zero element, and let the binary operation be the constant mapping to that element). Also, “monoidal” usually means a monoidal category, which is something very different from the underlying set of a monoid altogether.

> It's relatively common to refer to the underlying set as a monoid (or whatever structure you're talking about) if it's clear from the context what the operations are, though.

Relatively common even outside Haskell?

Yeah, the Haskell culture seems to have a problem with naming. It uses a lot of mathematical terms and a lot of alphabet soup identifiers. Part of the problem is just that naming highly abstract things is hard (what would be a meaningful name for the monadic return operation?) Another part is that the commercial programming world has developed a whole art of naming things so that a maintenance programmer can understand them easily, but the Haskell community doesn't want to use that art because it reminds them of Java or something.
I think your first part is intimately related to the second one. You can't use "maintenance programmer" names such as real-world analogues if your language is highly abstract; they carry too many incorrect assumptions to be precise.
Here's the next article in the series:

http://slpopejoy.github.io/posts/Effectful02.html

I was interested and pleased to note that the source for this post is available in Literate Haskell: https://github.com/slpopejoy/tatterdemalion/blob/working/pos...
This article has successfully caused me to understand what a monad is.
Something I've wondered about: How does Haskell not optimize away IO ()? In a pure, functional languages, a function that doesn't return anything can simply be eliminated — but of course it isn't.

I've always assumed that Haskell's compiler has a built-in rules about IO having side effects, but I never actually bothered to find out.

When I first started with Haskell, one of many small epiphanies was when I realized that the IO monad itself doesn't actually do anything. bind and return don't do anything except wrap and unwrap values, and there's nothing that checks if you are in some kind of "IO context" to permit things like file I/O. There's no magical "side-effectful computation chain engine" behind the scenes.

IO is the type, not the value. Something that takes IO and returns IO wouldn't be optimized out for the same reason a negation wouldn't be optimized out simply because it takes an Int and returns an Int.
I was referring to IO (), which is a type that can have only one value, () — hence my thinking that a compiler should simply optimize the call to (), since there is no other logical value that it can return.
The type isn't the same as its parameter. An IO () is not () just like Maybe () isn't (), and [()] isn't ().
I'm not sure I find that explanation satisfactory. This only implies that there is information not used by the compiler: The only conceivable value that the type IO () can have is still the value IO (). If the compiler knew this, then it could, and would, optimize it away. Is there a formal aspect of parameterized types that prevents this from — formally — happening?
There's no such thing as "the value IO ()". The type IO () has tons of possible values, which all have different internal representations:

1) The IO action that does nothing.

2) The IO action that prints the string "Hello World" and then returns nothing.

3) The IO action that reads a string from the console, reverses it, prints it back, and then returns nothing.

4) ...

It's a common misconception to think that IO X is a wrapper around X with some "type system magic" to prevent it from being used outside IO. It's nothing like that. For example, IO String is not a wrapper around a String, doesn't contain any String inside, and cannot be converted to String. Meditate on the fact that System.IO.getLine is not a function, but a value of type IO String. Then you will understand why IO () has tons of possible values.

You are asking the right questions, it just means you're ready to go one step down the rabbit hole. You need to carefully, and slowly, read SPJ's Tackling the Awkward Squad. You need to allow yourself to skim over the parts that are perplexing on first-reading and take them on faith. With more understanding you'll find even the answer given there not particularly satisfactory (in fact the paper points it out explicitly, trust me you'll know when you're in the precise section).

At this point you'll either accept the state of affairs, or want to go deeper down the rabbit hole, at which point you become more an academic than a programmer per se.

In a very precise sense what groovy2shoes is saying is exactly right, and you need to understand that truly strange weirdos inhabit IO () whereas, as you point out, only one thing, modulo non-termination, inhabits ().

"The only conceivable value that the type IO () can have is still the value IO ()."

There is some magic around the IO type, but it's not needed to answer your question.

If you look in GHC.Types in the package ghc-prim, you can find the definition of IO:

    newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
An IO action is a function from the state of the world, to the state of the world plus a value. So the compiler may know it's giving you (), but it doesn't know what the new state-of-the-world will be.
"We can see therefore how Monad offers strictly more powerful transformations than Functor. fmap can only transform an individual value “in place”, while >>= can return an empty list, or a list with more elements than the original. Functor is “structure preserving”, while Monad is “structure transforming”."

So for a list monad bind = flatMap fmap = map

is that correct?

Studying FP for quite a while first monad tutorial I come to fully understand finally.

> So for a list monad bind = flatMap fmap = map

> is that correct?

Yes, although Haskell calls "flatMap" "concatMap". Of course "return" is just "\x -> [x]", AKA "(: [])".

Instead of bind (">>="), you can implement Monads using "join" instead. For lists, "join" is simply "concat".

So is it possible to explain monads with Javascript ? or even LISP ? seems to me monads == haskell so if one doesn't understand haskell one can't understand monads , cause all monad tutorials are written in haskell. So I ask , what is the point of learning what a monad is if i'm writing some Java ?
Yes. Infact, a JavaScript Promise is pretty much the same as the IO monad* - the `then` method is a lot like `bind`. Promise.resolve is a lot like `return`

* almost the same to IO, because of recursive thenable assimilation, something that was added to promises but is totally short-sighted and unnecessary; and because promises are eager which means they don't represent the action to be executed, but the value of an already executed action

I find it interesting that people complain about Haskell's monadic IO, and yet nodejs event-driven IO is arguably the same idea but without the syntactic sugar.
A concise and good explanation thanks.
Note that the word "monad" itself refers just to the "shape" of the object (and some rules about that shape), not the functionality.

For example, an array can be a monad if we define a bind method as map + flatten

  Array.prototype.flatMap = function(f) {
    return _.flatten(this.map(f))
  }
which works like so:

  allArticles = authors.flatMap(author => author.articles);
return would be

  Array.of = function(val) { return [val]; }
The shape (type) is the same:

the promise's bind (then) method takes a function that takes a value and returns a promise, and returns another promise

the array's bind (flatMap) method takes a function that takes a value and returns an array, and returns another array

the promise's return (Promise.resolve) method takes a value and wraps it in a promise

the array's return (Array.of) method takes a value and wraps it in an array.

You can get the array method types (shapes) from the promise ones by replacing promise with array (if you decide to give the methods the same name).

The implemented functionality, however, is vastly different.

It is possible to implement monad-like behavior in javascript. But javascript does not have a type system to model them in.

Haskell is great because it's type system is rich enough to represent these concepts and not only that but enforce their usage (to a degree). Unfortunately this is like a drug, once you get a good dose of Haskell it's easy to crave an even richer type system with dependent types.

Yes, Fantasy Land is a specification of Monad (as well as other type classes) for JavaScript:

https://github.com/fantasyland/fantasy-land

For one example, Scala's `for` syntax is quite similar to Haskell's `do` notation, and lets you do some monadic things with `Option`, `Seq`, and other classes.

And then you can go much further into that world with the scalaz library.

This! Something that never fails to surprise me is that, while I have a mostly Java background for my day job, after we adopted Scala I often find it easier to understand/prototype a function in Haskell (my hobbyist language).

One of those things was Scala's `for` (and `flatMap`). It clicked for me once I noticed how similar it was to Haskell's `do` notation for Monads.

What's surprising to me is that it would seem Haskell has no right to be this helpful for understanding production code!

I enjoyed this explanation of monads in Ruby: http://codon.com/refactoring-ruby-with-monads
emacs lisp monad introduction:

https://www.google.fr/search?q=jvtoups+monads

I hope you know lisp just enough to adapt to the lisp-1 lisp-2 namespace.