An Overview of the Monad (functional.christmas)
95 points by kvalle 2390 days ago
20 comments

> we can think of a monad as a datatype with

Aaaaaaa

Be more precise about instance relations! They are the first source of confusion!

- String is a type. The value "abcde" is an instance of that type.

- Num is a typeclass. The type Int is an instance of that typeclass. The value 12345 is an instance of that type.

- Monad is a typeclass whose instances are generic types. The generic type List is an instance of that typeclass. The type List Int is an instance of that generic type. The value [1,2,3] is an instance of that type.

Explaining the operations of Monad is only useful when you're comfortable with "value belongs to type which belongs to generic type which belongs to typeclass".
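
To make the three levels above concrete, here is a minimal Haskell sketch. The names word, count, numbers, double and pairWithSelf are made up for illustration; only String, Int, Num, Monad and the list type come from the comment above.

```haskell
-- A value belongs to a type.
word :: String
word = "abcde"

-- Int is an instance of the typeclass Num; 12345 is a value of type Int.
count :: Int
count = 12345

-- The generic type [] (List) is an instance of the typeclass Monad;
-- [Int] is that generic type applied to Int; [1, 2, 3] is a value of [Int].
numbers :: [Int]
numbers = [1, 2, 3]

-- Works for every *type* that is an instance of Num.
double :: Num a => a -> a
double x = x + x

-- Works for every *generic type* that is an instance of Monad.
-- Note that m is applied to another type (m a), so m cannot be Int or String.
pairWithSelf :: Monad m => m a -> m (a, a)
pairWithSelf m = m >>= \x -> return (x, x)

main :: IO ()
main = do
  putStrLn word
  print (double count)              -- Num instance: Int
  print (double (1.5 :: Double))    -- Num instance: Double
  print (pairWithSelf numbers)      -- Monad instance: []
  print (pairWithSelf (Just 'x'))   -- Monad instance: Maybe
```

The constraint `Num a` ranges over types, while `Monad m` ranges over generic types, which is exactly the extra level the comment asks authors to spell out.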

And if you think about it, you're just asking the author to be clear about the types of the things they are talking about.
Is there really a need to distinguish between types and typeclasses though?

You can have a notion of type and inheritance where Monad is a type, and Lists, Promises, Option, etc. are subtypes of Monad.

My problem with understanding monads was as follows: I had no problem with understanding the mathematical definition, it was easy to check that a given monad verifies the axioms.

No, my problem was this one: everyone was saying "hey, you know if you have a purely functional language, it would be impossible to have IO, since e.g. a read function would have different outputs every time. We solve that with monads."

But that didn't feel like an explanation. I couldn't draw a line from the monad axioms to dissolving that impossibility. It was like saying "Einstein says we can't do FTL, but we can solve that with monads".

The way most people explain this is completely backwards. Monads don't actually solve the I/O problem. So forget about monads and just consider how to mathematically model I/O.

Take something like getLine (which reads a line from STDIN). What is "getLine", as a mathematical object? It's not a string; rather, it's some kind of "I/O operation" that, when executed, returns a string. Haskell calls such an object an "IO String". Similarly, the expression (putStrLn "foo") is an I/O operation that, when executed, doesn't return anything. Haskell calls this an "IO ()".

Once you develop an intuition of I/O operations, you can see how you might want to combine atomic I/O operations into larger and larger I/O operations, feeding the output from one as input into the other. Or how you could treat a plain Int as an "empty" I/O operation that, when executed, simply returns the Int. And how your entire program, its "main" entry point, can be modeled simply as one large I/O operation.

You can do all of that, and write every possible kind of I/O code in Haskell, without ever learning about monads.

What are monads for, then? They generalize the structure of this approach, with its (>>=) and return operators, and apply it to other cases that don't have anything to do with I/O. But this step is completely optional! This is very similar to how the idea of a mathematical "field" generalizes the arithmetic operators (+, -, *, /) and applies them to objects that don't look anything like the rational numbers. You don't have to learn about fields to successfully use rational numbers. And we would never dream of explaining the abstract idea of fields to people before we introduce them to rational numbers. Yet almost every I/O tutorial in Haskell begins with a half-baked explanation of monads.
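
A small sketch of the "I/O operations are ordinary values" idea, using only Prelude functions. The names actions and both are invented for this example; it only prints, so it needs no input.

```haskell
-- Three I/O operations stored in a list. Building the list prints nothing:
-- each element is just a description of something that could be executed.
actions :: [IO ()]
actions = [putStrLn "first", putStrLn "second", putStrLn "third"]

-- Combine two operations into one larger operation using (>>=).
both :: IO () -> IO () -> IO ()
both a b = a >>= \_ -> b

main :: IO ()
main = do
  -- We decide the execution order here, by how we combine the values.
  foldr both (return ()) (reverse actions)
  -- return lifts a plain Int into an "empty" operation that just yields it.
  n <- return (length actions)
  print n
```

Run as written, this should print third, second, first and then 3, because the list is reversed before the operations are chained together.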

That is how I finally got it, because I knew what groups/rings/fields were but did not realize at first that monads were just another abstraction like those and not an ontologically fundamental category. The way monads are usually presented leads you to believe that there is something magical in those axioms that somehow makes the impossible happen.
Learn the concrete, then abstract? That's sound wisdom about teaching most anything, in my opinion.
There’s an incredibly straightforward and readable paper by Simon Peyton Jones (one of the creators of Haskell and GHC) which explains how Haskell deals with IO, exceptions, and concurrency. It also explains why they settled on this, rather than some other design. In my opinion, it is the best explanation of the IO monad (specifically IO) out there. Even just reading the first 10 or so pages is completely worthwhile.

Paper: https://www.microsoft.com/en-us/research/wp-content/uploads/...

Here's another upvote for Tackling the Awkward Squad. A really nice paper. No surprise, since the author is the great Simon Peyton Jones :)
The solution does not really depend on monads. You pass "the world" as a parameter to read. If you pass a different world, you will get a different output. All functions which interact with the world get a "world" parameter and return a new modified "world". So now all IO operations are pure!

Monads are used as a trick to hide the "world" argument, so you can't cheat, e.g. by reusing the same world twice. Maybe something like ownership in Rust could be used for the same effect?

Yep, if you have linear types then you could model IO with functions that explicitly accept and return the world.
I thought Monads were needed for IO to make sure IO operations happened in the correct order?
No, you can force operations to execute in linear order without the use of monads, as long as one operation depends on the output of the previous. But you have to be careful not to reuse values.

Imagine if the IO operations took an explicit world parameter instead of using a monad:

    -- Hypothetical world-passing variants of putStrLn and getLine
    main :: World -> World
    main world1 =
       let world2 = putStrLn world1 "Hello, what's your name?"
           (world3, name) = getLine world2
           world4 = putStrLn world3 ("Hey " ++ name ++ ", you rock!")
       in world4

The IO operations would be executed in the expected order due to the dependency on the "world" output of the previous operation. Monads not necessary.

But if you accidentally used world2 twice then you would split the universe into two timelines. A monad can avoid this issue by hiding the world parameter and passing it on behind the scenes.
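
Here is a runnable toy version of that idea, as a sketch only. World is modeled as a record of pending input lines and collected output, which is an assumption made for this example. All names (World, putStrLnW, getLineW, Action, say, readLine) are hypothetical, and this is not how the real IO type is defined.

```haskell
-- A toy "world": input lines not yet read, and output written so far.
data World = World { stdinLines :: [String], stdoutLines :: [String] }
  deriving (Show, Eq)

putStrLnW :: World -> String -> World
putStrLnW w s = w { stdoutLines = stdoutLines w ++ [s] }

getLineW :: World -> (World, String)
getLineW w = case stdinLines w of
  []       -> (w, "")                       -- no input left: yield an empty line
  (l : ls) -> (w { stdinLines = ls }, l)

-- Explicit threading: nothing stops us from reusing world2 by mistake.
mainExplicit :: World -> World
mainExplicit world1 =
  let world2 = putStrLnW world1 "Hello, what's your name?"
      (world3, name) = getLineW world2
      world4 = putStrLnW world3 ("Hey " ++ name ++ ", you rock!")
  in world4

-- Hiding the world: user code never gets to name a World value.
newtype Action a = Action { runAction :: World -> (World, a) }

instance Functor Action where
  fmap f (Action g) = Action $ \w -> let (w', a) = g w in (w', f a)

instance Applicative Action where
  pure a = Action $ \w -> (w, a)
  Action f <*> Action g = Action $ \w ->
    let (w1, h) = f w
        (w2, a) = g w1
    in (w2, h a)

instance Monad Action where
  Action g >>= k = Action $ \w -> let (w', a) = g w in runAction (k a) w'

say :: String -> Action ()
say s = Action $ \w -> (putStrLnW w s, ())

readLine :: Action String
readLine = Action getLineW

mainHidden :: Action ()
mainHidden = do
  say "Hello, what's your name?"
  name <- readLine
  say ("Hey " ++ name ++ ", you rock!")

main :: IO ()
main = do
  let start = World ["Ada"] []
  print (stdoutLines (mainExplicit start))
  -- Both versions produce the same final world.
  print (fst (runAction mainHidden start) == mainExplicit start)
```

In mainHidden the only way to sequence two steps is (>>=), which passes each world along exactly once, so the "two timelines" mistake cannot be written.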

what actual concrete, real world, non sci-fi sounding problem does "splitting the timelines" represent?
It would just mean that IO operations would happen in random order or not at all, due to the laziness of Haskell. E.g. if you call getLine twice with the same "world" argument, then it would presumably only be executed once. And if you don't "consume" the returned world state from an IO operation, then the operation would not be executed at all.

Haskell does allow you to do something like this with the unsafePerformIO "backdoor".

The documentation states: "If the I/O computation wrapped in unsafePerformIO performs side effects, then the relative order in which those side effects take place (relative to the main I/O trunk, or other calls to unsafePerformIO) is indeterminate"

A less poetic way of stating it: if you re-use an already-used world state, you are violating a logical constraint of the system. No timelines will be violated, you just have an invalid program.

World states are supposed to be consumed once only: they are "linearly typed" in fancy language. But Haskell cannot directly enforce the linear-type constraint on world states, so in theory these states could be used more than once. Fortunately, the IO monad encapsulates (hides) the states, so that you can't make this mistake in your application code -- you can't reuse what you can't access.

World-states in Haskell aren't first-class values. They aren't actual objects that are being passed around during the execution of the program. They are more like tokens that exist during type checking, but are erased when the program is compiled or interpreted.

Another way of phrasing it is that Haskell interpreters/compilers have specialized support for the IO monad: during type checking, it acts as if these states exist, but they have no operational meaning in the executed program.

Due to HN's limitations, I can't reply to your reply, but I can reply here instead. :)

I'm not sure this is a helpful metaphor, but I don't think it's completely awful:

Think of a world state as being a variable representing a timestamp. Until the state is used up, the variable is empty.

Suppose I read from a file. I get a brand new state back from the file-reading action -- let's call it WorldState1 -- whose contents are "The time is now X." We don't have a value in X, it's just a variable. (You could think of it as "The time is now <NULL>" if that makes more sense.)

Later, when we use our world state in a subsequent action, X is resolved to an actual time. Now, WorldState1 means "the time is now 09:57:53.00001".

Later in your program (say, 09:57:53.00999), you try to reuse the same state. The system tries to write the current timestamp (.00999) into the state variable... but it fails, because the state already contains a value. The system requires that these variables can only be written once.

If these states existed at runtime, you would get a runtime error in your program. Instead, the type-checker simulates the execution of your program during compilation, realizes that the state would be written twice, and raises a compile-time error.

Again, it's just a metaphor. But hopefully it helps to get an intuition about these usable-once-only values.

-----

The problem with metaphors is that there are so many of them, and they are all wrong. :)

I just thought of another that might click better. Think of a world state as a container that is holding a little bit of potential energy. Like a tiny battery.

Every time you take an I/O action, you need to power it up. An I/O action converts potential energy into kinetic energy.

Once you use up a world state in an action, the battery is drained. It can't be recharged, so the battery is just dead.

Fortunately, each I/O action returns, along with its output, a brand new little battery that you can pass on to the next action.

At any point in your program, you have exactly one charged battery on hand (and a pile of used-up ones). So it only makes sense to use the charged one in your next I/O action.

Yes, this is the correct answer. For everything else you don't need monads. Monads impose an evaluation order, which Haskell by default does not have, since it is lazy.
Monads do not impose an evaluation order in Haskell. Monads let you thread state between computations, but that state may be lazily evaluated out-of-order.

The belief that monads sequence IO operations is why new Haskell programmers have difficulty writing functions like "open file, read file contents, close file, and return contents". They assume the contents will be read before the file closes, and not when the contents are lazily evaluated up the call chain (leading to a "can't read closed file" error).

> hey, you know if you have a purely functional language, it would be impossible to have IO, since e.g. a read function would have different outputs every time. We solve that with monads. But that didn't feel like an explanation.

In a pure language you can’t have functions that return the contents of a file, or print to stdout, or anything like that.

So what do you do? Instead of performing those computations in your program, you have to return (in the main function) a bunch of lambdas chained together that will be executed by the runtime that called the main function in the first place. Inside this chain of lambdas you will probably call the rest of your program.

The Haskell function you can call that “prints a string” doesn’t do much: it just returns some dummy structure (similar to an AST node, but you can chain your pure functions there too!) that will be read and executed later by the runtime. All the work is done later!

Your “main” in Haskell returns immediately; it doesn’t do much. It does zero IO.

This is similar to how promises work in JavaScript: you return a structure instead of the result. Also similar to how React works (If you know its internals) but don’t let that distract you!!!

This is how the IO monad works.
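
A toy model of that "return a structure, let the runtime execute it" picture might look like the following. It is a sketch to build intuition, not the actual definition of IO; Program, andThen, runIO and runPure are names invented here.

```haskell
-- A description of an I/O program: plain data, no effects.
data Program a
  = Done a
  | Print String (Program a)
  | ReadLine (String -> Program a)

-- Chain two descriptions: when the first one is done, build the next from its result.
andThen :: Program a -> (a -> Program b) -> Program b
andThen (Done a)       k = k a
andThen (Print s next) k = Print s (next `andThen` k)
andThen (ReadLine f)   k = ReadLine (\l -> f l `andThen` k)

-- Constructing this value performs no I/O at all.
greet :: Program ()
greet =
  Print "Hello, what's your name?" (Done ()) `andThen` \_ ->
  ReadLine Done `andThen` \name ->
  Print ("Hey " ++ name ++ ", you rock!") (Done ())

-- One possible "runtime": walks the structure and really performs the effects.
runIO :: Program a -> IO a
runIO (Done a)       = return a
runIO (Print s next) = putStrLn s >> runIO next
runIO (ReadLine f)   = getLine >>= runIO . f

-- Another "runtime": pure, with scripted input and collected output.
runPure :: [String] -> Program a -> (a, [String])
runPure _        (Done a)       = (a, [])
runPure ins      (Print s next) = let (a, out) = runPure ins next in (a, s : out)
runPure (l : ls) (ReadLine f)   = runPure ls (f l)
runPure []       (ReadLine f)   = runPure [] (f "")

main :: IO ()
main = print (snd (runPure ["Ada"] greet))
```

The same greet value can be handed to either interpreter, which is the sense in which the program only describes effects and something outside it carries them out.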

People later discovered that this kind of structure could be used for a lot of stuff, and other monads (State monad, Maybe monad, Free monad) were born.

But these others don’t use the IO runtime.

Also look up “functional core imperative shell”, this is how IO monad and the IO runtime works.

Actually I/O in general is something that almost always comes from outside the language, so to speak. How do you define a file writing function in C? You have to appeal to some already existing magic syscall which actually cannot be formally defined in the base language. It has no real semantics in the base language, it’s a magic function-like thing that happens to have a “side effect.”

Same with Haskell, only that the base language semantics absolutely forbids functions with side effects, so you have to think of a different way to model it. The IO type is that way. It models effects not as side effects of function application but with these explicitly sequenced command values that are returned from the main function.

It’s just a different formalism. Just like how math equations don’t themselves have any physical effects but can still model physical systems in different ways. Computer languages don’t do I/O, they model and represent, with different semantics and restrictions.

You are right, but this is something that is not usually emphasized. I think I could have been persuaded back then that with enough cleverness you could write printf out of if's and while's :)

Also, some years ago I even thought -- but not consciously -- that you could implement preemptive multitasking just from software (you can do it trivially via emulation, but it's slow and most importantly it's not what happens in actual OSes, where the code you write is the actual code that runs on the CPU). I think when I began reading about assembly and OS development, I was expecting the solution to appear at some point. But again, I was not aware of it (the moment I became aware, I realized how absurd it was), so I got disappointed in assembly without even knowing why.

According to one of my lecturers, who knows both Haskell and is a Category Theorist, and according to some of my fellow students at the time, monads in functional programming should not be seen as strict versions of the category theoretical concept.

The key concepts for FP are: you are able to reference system state, you are able to leverage composability better than non-FP languages, you can abstract and prove things around how your program should behave.

I have a copy of Category Theory for Programmers that I haven't read in earnest, but I would think that the section on monads can clarify the main differences between the mathematical monad and the FP monad.

> According to one of my lecturers, who knows both Haskell and is a Category Theorist, and according to some of my fellow students at the time, monads in functional programming should not be seen as strict versions of the category theoretical concept.

Indeed, e.g. lists are monadic, but it is more natural to talk about maps from one list to another, and folding.

But neither is a fundamental monad operation. The natural operation for monads in the example of lists is flatmap, which takes in a function (X -> List Y), and a List X, and returns a List Y. The `map` function is a less powerful sibling of flatmap, and the `fold` operation is more powerful and general than flatmap.
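
The relationship between the three list operations can be sketched in a few lines of Haskell. The names flatMap, mapViaFlatMap and flatMapViaFold are chosen for this example; in the standard library, flatMap on lists corresponds to concatMap, or to (>>=) with its arguments flipped.

```haskell
-- flatMap for lists: apply f to each element and concatenate the results.
flatMap :: (a -> [b]) -> [a] -> [b]
flatMap f xs = xs >>= f

-- map is the special case where every element yields exactly one result.
mapViaFlatMap :: (a -> b) -> [a] -> [b]
mapViaFlatMap f = flatMap (\x -> [f x])

-- flatMap is in turn just one particular fold.
flatMapViaFold :: (a -> [b]) -> [a] -> [b]
flatMapViaFold f = foldr (\x acc -> f x ++ acc) []

main :: IO ()
main = do
  print (flatMap (\x -> [x, x * 10]) [1, 2, 3 :: Int])          -- [1,10,2,20,3,30]
  print (mapViaFlatMap (+ 1) [1, 2, 3 :: Int])                  -- [2,3,4]
  print (flatMapViaFold (\x -> replicate x x) [1, 2, 3 :: Int]) -- [1,2,2,3,3,3]
```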

So whereas CT monads distinguish one (admittedly very powerful) notion of composability, monad-like notions in FP may find other notions of composability more natural.

Of course, the main draw of using monads in FP is with the way it quarantines off state and context.

The execution of the main function has side effects. All io functions have side effects that are only observable by main. The print function itself is a side effect. Yes I'm talking about Haskell.

What the IO monad does is force the user to separate pure functions and impure functions via type checking.

I think it's sort of incorrect to say Haskell is pure. It's more that only from a certain perspective Haskell is pure.

The way the impossibility is resolved is that the language itself isn't responsible for doing IO, but rather it is responsible for (purely) constructing a value which represents an IO computation to be performed. This value is constructed by using the bind operator to combine simpler IO computation values together. That computation is then actually executed by the runtime system, which is conceptually external to the language itself. The language knows about evaluation of pure expressions, the runtime knows about execution of effectful actions.
You're referring to the axioms defined by the type class.

The actual math definition is pretty deep and requires knowledge and intuition on what a functor is and what a natural transformation is.

Yeah, but I know what those are, I have a math background.
here is a good wiki resource that helped me to understand the use of monads.

https://wiki.haskell.org/IO_inside

I'll post this again "If you are interested in learning functional programming do not learn about monads".

There is no practical justifiable reason to learn about monads unless you already know Haskell. And there is no reason to learn Haskell unless you already know another statically typed functional language.

If you are interested in learning functional programming then either

Learn dynamically typed functional programming and pick Clojure (or Racket).

Or learn statically typed functional programming and learn Elm then F# (or OCaml).

Haskell is the worst language you could choose to learn FP. If you're trying to learn FP, stay away from it. If you're a huge FP advocate, don't push it. And finally, monad tutorials are the white noise at the end of a wrong-way track in learning about functional programming.

Outside of Haskell, monads are about as important in terms of learning, knowing and effectively using functional programming languages and concepts, as braces being on the same line or next line is important to learning about general programming. It's not.

Does that mean no one can ever mention monads? No, of course not. But it's such a tiny topic that it's absurd it gets so much discussion, like it's some key to learning FP or some incredibly crucial concept. It's not.

I use OCaml, and monads also come up in OCaml code. Monads may not be as essential to understand in OCaml as in Haskell because OCaml doesn't use them to track IO in the type system, but people use them in OCaml to chain options and results. In fact, OCaml 4.08 introduced syntactic sugar that makes functor, applicative, and monad usage more convenient. I'd say that monads are something that non-Haskell functional programmers should learn.
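
For readers who have not seen option chaining, here is a small sketch of the pattern. It is written in Haskell (Maybe) rather than OCaml to match the other code in this thread, and the stock table, safeDiv and perContainer are made up for the example.

```haskell
-- A lookup table; Prelude's lookup returns Nothing for missing keys.
stock :: [(String, Int)]
stock = [("apples", 10), ("boxes", 5), ("crates", 0)]

-- Division that reports failure as Nothing instead of crashing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Three steps that can each fail, chained without nested case expressions.
-- The first Nothing short-circuits the rest.
perContainer :: String -> String -> Maybe Int
perContainer item container = do
  n <- lookup item stock
  c <- lookup container stock
  safeDiv n c

-- The same chain written with (>>=) directly.
perContainer' :: String -> String -> Maybe Int
perContainer' item container =
  lookup item stock >>= \n ->
  lookup container stock >>= \c ->
  safeDiv n c

main :: IO ()
main = do
  print (perContainer "apples" "boxes")    -- Just 2
  print (perContainer "apples" "crates")   -- Nothing (division by zero)
  print (perContainer' "pears" "boxes")    -- Nothing (unknown item)
```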

Monads also turn up with lists (flatmapping) and promises. Although you don't need to know what a monad is to flatmap a list or use promises, I'd say that being aware that lists have a monad instance or that promises have a monad instance provides insight.

Outside of the terminology and terrible, terribly abstract explanations, I find an understanding of monads to be greatly important.

> Outside of Haskell, monads are about as important in terms of learning, knowing and effectively using functional programming languages and concepts, as braces being on the same line or next line is important to learning about general programming. It's not.

For example, understanding that Promise in JS is a monad helped me figure out how to transform and combine event-triggered functions in a clean manner into promises with shared state, that had a guarantee on execution order.

And every programmer already knows monads once they start to use design patterns in lists, option, promises, io, etc.

> Haskell is the worst language you could choose to learn FP.

Why?

Can't post full thoughts now, but will cover a few points.

Haskell has an extremely steep learning curve. And you have to make it very far up that curve to begin to see or understand any of the value you're getting from FP. Up until then, it's just different and it's hard to see why you're jumping through all of these hoops when you could easily use a non-FP language. You have to pay very significant learning costs before the benefits of any new insights start to outweigh those costs. Elm, in contrast, you can honestly learn in a weekend, and it demonstrates 60% of the value a user gets from FP. Very high ROI for expanding your knowledge of programming ideas and approaches.

Haskell evolved as a language to explore programming language ideas in; as a result it is filled with warts. You still need to learn and deal with a lot of those warts to be decently productive; none of those warts are relevant concepts, just warts. You also need to be perfect and jump through a lot of hoops with strict adherence to all the rules to get things working. In contrast, F# allows you to be non-strict in your application of FP concepts. This is actually really beneficial for learners. It allows for separation of concerns: apply the new concept you're learning now, and do everything else in the way you're accustomed to. It's roughly equivalent to updating a codebase with gradual typing. You're a lot more likely to be successful if you can add TypeScript typing one file or area at a time than if TypeScript only worked if all of your files or none of your files had types.

Also, Clojure and F# run on the JVM and .NET respectively, so a lot of practical use cases are already solved and solved well. As well, a lot of your existing knowledge will transfer over.

Finally once you've learned those core FP concepts Haskell is significantly easier to learn so you can easily dive into the Haskell unique parts.

There are other thoughts, but this should be a good gist.

Interestingly, I started off with Elixir & Elm and then moved on to Haskell. Elixir & Elm both feel like DSLs with abstraction level purposefully constrained relatively low. A lot of the concepts in those languages didn't make sense until I started picking up Haskell.

I don't really know if it's ultimately worth it or not but as a programmer I want to evolve by raising the abstraction level in my code. Haskell is the perfect platform for that.

Btw any examples of those warts in Haskell?

I found the article the author links to, which explains functors, applicatives, and monads in pictures, to be way more helpful: http://adit.io/posts/2013-04-17-functors,_applicatives,_and_...
Yeah, interesting to learn that a monad is a value in a box, almost as if it were an object
It isn’t though. A monad is just an abstract mathematical object that satisfies a few axioms. It doesn’t have anything to do with implementations or computations, let alone objects in the OOP sense.

It’s better to think of it like this: a value in a box is one example of a data type that can satisfy the monad axioms. Another example is a delayed computation that produces a value when executed. Yet another is a delayed computation which produces no value at all but causes a message to be printed to the screen.

The problem is, of course, that a monad is not a value in a box. That's a misunderstanding, resulting from an oversimplified metaphor. This shows a frequent problem with metaphors.

See "the fallacy of monad tutorials": https://byorgey.wordpress.com/2009/01/12/abstraction-intuiti...

Oh for the day when monad tutorials and explanations don't outnumber useful Haskell programs with a purpose other than programming a computer by many orders of magnitude.

The list of such programs hasn't grown much since the last time I noticed that we did this 2-3 years ago, has it? git-annex, ion, pandoc, and..? There were a couple more than that, I think.

“There are only two kinds of languages: the ones people complain about and the ones nobody uses.”

Except for Haskell which happens to be both!

Are you talking useful for us as developers, or Haskell being used in production? Because there are plenty of examples of Haskell being used in production.
Postgrest is a pretty cool one. And arguably Elm could be included. That's getting more widely used
There's more Haskell-based software out there than the stuff that's open source or advertised as being Made With Haskell
I think Xmonad is the biggest one.
> We simply cannot think about such abstract concepts without introducing some sort of metaphor.

Metaphors are overrated. Often they obscure more than they explain, especially for such an abstract concept.

What if you had to explain "statements" with a metaphor? Or "expression"? I challenge anyone to come up with metaphors which help more than they confuse.

We learn programming concepts through examples, and seeing how they are useful. Not through metaphors.

Metaphors are a very powerful tool in human cognition, and there has been a lot of research into their usage from a cognitive science perspective (see e.g. the works by George Lakoff).

Of course, an abstract mathematical proof is independent of any metaphors you may ascribe to them, but new maths is usually discovered by applying intuition to a problem and then verifying that the intuition is in fact correct. To do that, you need metaphors and analogies. The interesting part is when you can use several metaphors for the same concept and the key insight comes from a shift in perspective; e.g. some statements about complex numbers make more sense when viewed from a geometric (rotation-inspired) perspective. Similarly, thinking about e.g. monads as "containers that can be flattened" can lead to certain insights, but thinking about them in some other ways can maybe lead to different insights.

However, I agree with you in one point: we learn mathematics and programming by example; I think the key point is that people have to build their own intuitions and metaphors. It can be useful to guide people along that path, e.g. by pointing out some helpful analogies or dismissing some unhelpful ones, but ultimately, you have to start working with the concept and prove/disprove your own assumptions (this is as true of a new programming concept you're not familiar with yet as of mathematics, only that in the former case you're much less formal about it).

But usually nobody cares about or checks whether your implementation of the operator you're overriding still obeys all the axioms. Sometimes we have tests for whether some interface implementation fulfills some axioms, but usually we do not.

Somehow with monads we do. And I suspect the reason is that monads come from an area of programming where proving a program correct is preferred to testing. And for proving correctness, you need a proper specification and then implementations that follow that specification. And a specification is done using abstract concepts.

If you want to prove anything about your program (or even understand obscure edge cases), concepts through examples will not work for you.

> But usually nobody cares about or checks whether your implementation of the operator you're overriding still obeys all the axioms.

Make some non-associative implementation of (+) and see on how many compilers code using it will work.

"statement" and "expression" in their programming sense are already metaphors.
Well, the words are taken from math but mean something different in programming. A statement in math has a truth value, so it would actually correspond to an expression rather than a statement in programming.

Saying that a statement in programming is similar to a statement in math (or law or politics or whatever) would be more confusing than informative.

The words are taken from, you know, words. Imprecision can sometimes be more confusing than informative. Are you saying (with a simile?) that the author should have said analogy rather than metaphor?
No, I'm saying the use of metaphors to explain programming concepts is often more confusing than helpful.
Bartosz Milewski's Category Theory for Programmers also has nice explanations for all these concepts. Haven't read through everything, but what I've read has been very high quality.

https://bartoszmilewski.com/2014/10/28/category-theory-for-p...

His video lecture series is also well-paced.

https://www.youtube.com/user/DrBartosz/videos

Milewski's explanations are among the best of what you can find on the net. Much better than OP anyway.
Here's an interesting article about the "Fallacy of Monad Tutorials" and why most metaphors fail: https://byorgey.wordpress.com/2009/01/12/abstraction-intuiti...

The key insight here is: there are no shortcuts. Most metaphors are useful but limited. Once a metaphor "clicks" for a person, they forget all the work it took them for the concept to really click, and think the shortcut will help other people: "Monads are like burritos! If only other people understood this, they would get them!".

> "But now Joe goes and writes a monad tutorial called “Monads are Burritos,” under the well-intentioned but mistaken assumption that if other people read his magical insight, learning about monads will be a snap for them. “Monads are easy,” Joe writes. “Think of them as burritos.” Joe hides all the actual details about types and such because those are scary, and people will learn better if they can avoid all that difficult and confusing stuff. Of course, exactly the opposite is true, and all Joe has done is make it harder for people to learn about monads, because now they have to spend a week thinking that monads are burritos and getting utterly confused, and then a week trying to forget about the burrito analogy, before they can actually get down to the business of learning about monads."

Already in this comments section I see these kinds of misunderstandings: "monads are containers with values in them, almost like objects", "monads are for effectful computations", "monads are for imposing a sequential order", etc. These are all particular uses of monads, but not what monads are.

https://xtendo.org/monad

This is the best introduction to the topic I've seen for Haskell beginners. It provides some much needed context.

That's a great presentation that every Haskell beginner should see.

As an extra it has some insights on the failure modes of (monad) tutorials, like "Using things you don't know in order to teach you something you don't know".

This isn't a useful article about monads in my opinion. It adds neither insight nor depth, is completely redundant given the hundreds of articles out there, and is unlikely to give anyone struggling with monads an insight that would help them. Particularly egregious is the fact it introduces a formal definition only to immediately ditch it because it's "probably pretty incomprehensible" -- and this in a 2-minute read with no room to spare.

Besides, everybody knows monads are not like containers. They are like burritos! https://blog.plover.com/prog/burritos.html

I think in order to understand monads you should try to invent them yourself. Try to create a pure IO function in the programming language you know best. Consider the state of the world as an input parameter of your function and return a modified version of the world's state. Then try to compose IO functions and try to write an interface for IO function composition.
This was the first time I've seen the "A monad over a category C is a triple (T,η,µ)" in a definition. That confirmed some mental models that I was a little unsure about. I'm new to the fp scene; still trying to get a good foundational understanding.
If you're new to functional programming, you don't need to worry about the category theory definition. Especially since this tutorial simply mentions it as a way to say, "this is too complicated for us, so we're going to explain it another way," and makes no attempt to break down the definition as code. In fact, be careful that you're not seeing category theory definition and misunderstanding it to confirm a mistaken intuition.

The category-theoretic definition basically states that a monad is an endofunctor T equipped with two operations, return :: a -> T a and join :: T (T a) -> T a. T being an endofunctor just means that it is "mappable," AKA it has a function fmap :: (a -> b) -> T a -> T b that "lifts" functions into the type.

Monads capture the idea of flattening. If fmap changes the "contents" while preserving the "shape," join collapses nested layers to change the "shape."

Instead of join, the Haskell typeclasses (and this article) use another function called bind, or >>=. You can derive join from >>=, and you can derive >>= from join and fmap. For lists, a more familiar name for >>= might be flatmap.
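The two derivations can be sketched in a few lines. The primed names join' and bind' and the value expanded are local to this example; join itself comes from Control.Monad in base.

```haskell
import Control.Monad (join)

-- join written using >>=: bind the inner layer with the identity function.
join' :: Monad m => m (m a) -> m a
join' mma = mma >>= id

-- >>= written using fmap and join: map first, then flatten.
bind' :: Monad m => m a -> (a -> m b) -> m b
bind' ma f = join (fmap f ma)

-- For lists, bind' behaves like flatMap / concatMap.
expanded :: [Int]
expanded = bind' [1, 2, 3] (\x -> [x, x * 10])
-- expanded == [1,10,2,20,3,30]
```

The fmap step changes the contents and produces a nested structure; join then collapses the nesting.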

The visualisation linked to at the end goes into some more detail and was definitely worth the extra 5 minute read for me!
A monad certainly is not a container. None of the functional patterns are containers.
TIL that the domain extension ".christmas" exists. I wonder why though...
Capitalism
I've always struggled to grok the concept of monads. Worth the 2 minute read.
Except this “simple” metaphor isn’t too useful. For monads like IO and State, “monad as a container” is a stretch. And some don’t fit at all — the list monad often is imagined as a computational context with nondeterminism, not simply a container.

https://byorgey.wordpress.com/2009/01/12/abstraction-intuiti...

I don't really understand your point of view. Especially the list monad is literally a list. The shape of bind for the list monad is [a] -> (a -> [b]) -> [b]. If that's not a container, I don't know what is. The fact that it can be used to solve problems where the list represents a non-deterministic result is really secondary. I mean, I suppose you can think of it as being different, but it seems a lot more complicated in my mind.

I think some "containers" are definitely hard to envision. A partially applied function is also a monad (i.e. it's trivial to write a meaningful bind for it). It may be hard to think of a partially applied function as "containing" the applied parameters, but I still think that's easier than any other way of envisioning it. But maybe it's a matter of horses for courses.
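For reference, a bind for functions from a fixed argument type r can be sketched as below. The name bindFn is made up for this example; the behaviour mirrors the Monad instance for functions that ships in base.

```haskell
-- bind for functions that all take the same argument type r:
-- run f on the argument, then hand both the result and the same
-- argument to the next step.
bindFn :: (r -> a) -> (a -> r -> b) -> (r -> b)
bindFn f k = \r -> k (f r) r

-- Both steps read the same input: add one to it, then multiply the
-- input by that result.
example :: Int -> Int
example = (+ 1) `bindFn` \x -> (* x)
-- example 3 == (3 + 1) * 3 == 12
```

Whether you read r as something "contained" or as a shared environment passed along is a matter of taste; the code is the same either way.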

For IO, the monad is a "box" you put the entire mutable universe in so you can pretend the rest of your program is stateless.
A monad is for composing computations with effects: let's say you have some function f :: a -> b returning some b, but now you need the computation to happen with some extra effect (throwing exception, changing state, futures, etc).

A way to model effects is to have f :: a -> Tb, where T is a type constructor encapsulating a value of type b produced with some effect. Let's call this new f a T-program, then a monad is exactly what you need to compose T-programs nicely.

The monad "return" is the simplest T-program, and the monad "bind" helps you compose T-programs: given another T-program g :: b -> Tc, you cannot simply compose g with f since the types don't match. However, the bind operator >>= :: Tb -> (b -> Tc) -> Tc, with g supplied as its second argument, gives you a function (>>= g) :: Tb -> Tc which does compose with f, producing a new T-program: a -> Tc. So now you can compose T-programs, yay! The monad laws ensure that T-programs form a category, which is to say that composition works the right way.

So a monad is just the plumbing you need to compose computations with effects in a nice way.
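A small sketch of composing T-programs, taking T = Maybe so that the effect is possible failure. safeRecip, safeSqrt, and composeT are hypothetical names for this example; >=> is the library operator from Control.Monad that does the same composition.

```haskell
import Control.Monad ((>=>))

-- Two T-programs with T = Maybe.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Composition spelled out with bind: run f, then feed its result to g.
composeT :: Monad t => (a -> t b) -> (b -> t c) -> (a -> t c)
composeT f g = \a -> f a >>= g

recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = safeRecip `composeT` safeSqrt

-- The same composition using the library operator.
recipThenSqrt' :: Double -> Maybe Double
recipThenSqrt' = safeRecip >=> safeSqrt
```

Here recipThenSqrt 4 gives Just 0.5, while an input of 0 or a negative number gives Nothing without any explicit error checks between the two steps.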

> A monad is for composing computations with effects

To nitpick: that's only one particular use of some monads.

It's pretty close to how both Moggi and Wadler formulated it in their respective seminal papers which introduced monads for representing different notions of computations in functional languages.
Well of course, but arguably there's not much space between "monads-as-containers/burritos" and "monoids-in-the-category-of-endofunctors" presentations...

If you know a better one in between I'm all ears :)

Another way to say it: a monad is a generic type with just enough API to write binding sequences like

    x <- fetch "foo"
    y <- frob x
    return (x + y)
The meaning of binding/sequencing is decided by the particular monad instance. This is why it's a useful formalism to represent things like asynchronous I/O (you make the sequencing mean promise chaining), abortable computations (you make the sequencing cancel when it sees failure values), combinatorial/nondeterministic programming (you make it so one binding can happen several times).
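To make "the instance decides the meaning" concrete, here is one generic block run in two different monads. addBoth and the value names are hypothetical, chosen only for this sketch.

```haskell
-- One generic block; the monad chosen at the call site decides what
-- sequencing means.
addBoth :: Monad m => m Int -> m Int -> m Int
addBoth mx my = do
  x <- mx
  y <- my
  return (x + y)

-- Maybe: sequencing stops at the first failure value.
okMaybe, failedMaybe :: Maybe Int
okMaybe     = addBoth (Just 1) (Just 2)   -- Just 3
failedMaybe = addBoth (Just 1) Nothing    -- Nothing

-- List: each binding happens once per element, so every combination
-- of x and y is produced.
allSums :: [Int]
allSums = addBoth [1, 2] [10, 20]         -- [11,21,12,22]
```

The body of addBoth never mentions failure or multiple results; both behaviours come entirely from the >>= of the instance in use.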

Monads are also closely related to continuation-passing style or delimited continuation capture, and those techniques can also be used to implement everything monads can.

I always see monad tutorials like this conflate (1) "understanding using particular monads" with (2) "understanding using monads in general", which are completely different things. Luckily it's easy to not confuse those two with (3) "understanding monads abstractly" which tends to break out the category theory, but some tutorials like jumping on that too.

To demonstrate the differences, suppose now that you're deep diving into a codebase which depends on an Async monad. Your eyes light up and you go aha(!) that means async programming, I've done that before! The point is, you saw "Async", you know it means "make some task's outcome depend on a previous task", which is the behavior of bind for that particular monad. Without (1) the word "monad" is really unnecessary. Why not just refer to bind as some arbitrary async-API function? In summary, you don't need (1), (2), or (3) for this! Monad is just a word.

Okay, when do we need (1)? Let's say you've changed companies, and now you're plugged into another codebase and oh(!) there's that word "monad" again, but this time it's the "Future" monad. Another whole API to learn... but soon you realize that it's again the same old "make some task's outcome depend on a previous task". Except they just call it "Future", and instead of "bind", they say "flatMap" which is the same thing! And you bump into it again and again with different names -- Deferred, Aff, Task -- but in the end it's the same behavior and API. How upsetting that they don't call them all Async and be done with it! But if you have (1) you'll know they are just the same monad, given different names. That's (1): familiarity with the usage of a particular monad. Later on you might collect understanding of the Option/Maybe/Nullable monad, or the Result/Either/Error monad, or the List/Stream/Nondeterministic monad, so now names don't bother you because you know CONCEPTS, but all of this is still (1) if each monad is being treated as an independent API to grok.

That's where (2) comes in. Suppose now that you're deep diving into a codebase which depends on the "Anisotrope" monad. You've never heard of this before[0], but you know monads in general i.e. (2), so you know you can make functions whose output (an Anisotrope object) depends on some property of an Anisotrope object, and that all the details about this dependency can be found in the definition of bind for this particular monad. In other words you can use knowledge of (2) to quickly adapt to new monads without having to regard them as a completely new concept. It lets you answer questions like "what's this Probability monad supposed to do? what about this Forkable monad?" in a general way. Conversely, you can now start writing monads of your own by thinking about what bind could mean for your new "Transmogrifying" monad.
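As a sketch of what "thinking about what bind could mean" looks like in practice, here is a small made-up monad. Logged, logStep, runLogged, and pipeline are all hypothetical names; the idea resembles the standard Writer monad but is written out by hand so the bind is visible.

```haskell
-- A value together with the log lines produced while computing it.
newtype Logged a = Logged (a, [String])

instance Functor Logged where
  fmap f (Logged (x, logs)) = Logged (f x, logs)

instance Applicative Logged where
  pure x = Logged (x, [])
  Logged (f, l1) <*> Logged (x, l2) = Logged (f x, l1 ++ l2)

instance Monad Logged where
  -- bind decides what "depends on" means here: run the next step on
  -- the value and append its log to the log so far.
  Logged (x, l1) >>= k =
    let Logged (y, l2) = k x
    in  Logged (y, l1 ++ l2)

-- Produce a value and record one log line.
logStep :: String -> a -> Logged a
logStep msg x = Logged (x, [msg])

runLogged :: Logged a -> (a, [String])
runLogged (Logged p) = p

pipeline :: Logged Int
pipeline = do
  x <- logStep "start with 2" 2
  y <- logStep "double it" (x * 2)
  return (y + 1)
-- runLogged pipeline == (5, ["start with 2", "double it"])
```

Everything specific to this monad lives in the definition of >>=; the do block in pipeline reads the same as it would for any other monad.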

Is this useful? Imagine learning all OOP patterns knowing that every pattern is the same concept except for this one thing: bind. That means it's relatively trivial to learn these patterns and make new ones. Except they're not OOP patterns, it's monads, and no, you can't generalize OOP patterns the same way.

Anyways, this is also where "monad is a box"-esque metaphors fall short. They describe particular monads reasonably well, but don't generalize when you start reaching for many others: Reader (environment-passing), List (nondeterministic computations), Probability (probabilistic programming), or fun ones like parser monads, Amb (backtracking), Tardis (state-access-timeline-specifying)[1].

[0] I hope not, since I made it up. If you have (3) you're probably deriving a working definition already. Have fun!

[1] http://hackage.haskell.org/package/tardis-0.4.1.0/docs/Contr...

I really like the distinctions you are drawing. And I think that it's better for people to walk through this in (1), (2), (3) order rather than jumping to (2) or (3) before they know (1) on at least one monad.

> Imagine learning all OOP patterns knowing that every pattern is the same concept except for this one thing: bind. That means it's relatively trivial to learn these patterns and make new ones. Except they're not OOP patterns, it's monads, and no, you can't generalize OOP patterns the same way.

From the OOP perspective, though, monad is a pattern where you only have one behavior (bind) that you can customize, and you have to do everything through that one behavior. That looks like a straightjacket. Sure, it composes well, but... why limit yourself like that?

Thanks!

> From the OOP perspective, though, monad is a pattern where you only have one behavior (bind) that you can customize, and you have to do everything through that one behavior. That looks like a straightjacket. Sure, it composes well, but... why limit yourself like that?

That's an excellent question. Thinking of monad as a pattern itself, with bind as its API, made me realize that OOP patterns (as a categorical concept) aren't the best analogy for the different behaviors monads encode, but rather monad is just one of the patterns. Makes sense -- there are obviously behaviors that are not generalized by monads but are in the realm of OOP patterns. For instance, the visitor pattern is covered by functors.

(At the time of writing I was thinking about how bind can relate seemingly disparate behaviors: error handling, async, state, etc. What other useful structures seem disparate? OOP patterns! And so I fell into the black pits of "Monads are just like <thing>" myself, where thing = OOP patterns.)

As for the question: One could call it a straightjacket but one could also call it "the law". Limiting yourself is exactly how Straightjacketed Sam can ensure correctness properties of the program. Monads force Sam to work within the monad, so they can only write "a depends on b" in a specific and well-defined way. This is analogous to how the singleton pattern would force Sam to work with this one instance of an object, or how the factory/smart constructor pattern ensures only certain (valid) objects can exist. If any of these restrictions are too limiting, there's no need to use the patterns, but you'll also lose the laws (guarantees)!

Lose the straightjacket, lose the guarantees. That makes sense. I had been thinking of it as just a straightjacket.

And, as you point out, there are some OO patterns that can also provide useful guarantees and constraints.

Didn't even know there existed a .christmas domain.
Only accessible while Santa Claus is visible in the night sky.
This was a little sparse but it was a fun read.
> A monad is a concept that belongs to a branch of mathematics called category theory, where it was introduced in the 1960s.

No. [1]

[1] General Theory of Natural Equivalences, Mac Lane and Eilenberg, 1945.

https://www.ams.org/journals/tran/1945-058-00/S0002-9947-194...

Sorry, I misread; monads were only introduced later. The above paper introduced the field of category theory itself.