Well, you see, that's one of the problems... monad implementations don't have to be "containers", or at least not the way most people mean. This was one of the critical errors in many of the aforementioned "tutorials". IO, the quintessential monad, is not a container, for instance.
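To make the "not a container" point concrete, here is a rough sketch of a monad-shaped type that holds a recipe for a computation rather than a value. This is not how Haskell's IO is actually implemented; `Action`, `unit`, `bind`, and `tick` are names made up for the illustration.

```python
class Action:
    """A description of a computation: it stores a function, not a value."""

    def __init__(self, run):
        # run is a zero-argument callable, executed only when asked for.
        self.run = run

    @staticmethod
    def unit(value):
        # Wrap a plain value as an action that just returns it.
        return Action(lambda: value)

    def bind(self, f):
        # f takes this action's result and returns the next Action.
        # Nothing executes here; we only build a bigger description.
        return Action(lambda: f(self.run()).run())


# Hypothetical side-effecting action: bumps a counter and returns its new value.
counter = {"n": 0}

def tick():
    def step():
        counter["n"] += 1
        return counter["n"]
    return Action(step)


# Chain two ticks and add their results. Building this runs nothing.
program = tick().bind(lambda a: tick().bind(lambda b: Action.unit(a + b)))
```

Building `program` leaves the counter untouched, and each call to `program.run()` can give a different answer, so there is no item sitting "inside" it waiting to be taken out.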
(A nearly exact parallel can be seen in the Iterator interface. You can describe it as "a thing that walks through a container presenting the items in order"... and yeah, that's the majority use case and where the idea came from... but it's also wrong. All it really is is "a thing that presents items in some order". It doesn't have to be from "a container". You can have an iterator that produces integers in order, or strings in lexicographic order, or yields bytes from a socket as they come in, or other things that have no "container" anywhere to be found. If you have "from a container" in your mental model then those things are confusing; if you understand it simply as "presenting items in order" then having an iterator that just yields integers makes perfect sense. A lot of the Monad confusion comes from adding extra clauses to what it is. Though by no means all of it.)
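For what it's worth, the "no container anywhere" case is easy to show. A minimal sketch in Python, where `Countdown` and `naturals` are names invented for the example:

```python
import itertools


class Countdown:
    """Iterator that computes each item on demand; no container backs it."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def naturals():
    """Endless iterator over 0, 1, 2, ... which could never fit in a container."""
    n = 0
    while True:
        yield n
        n += 1


# Take a finite prefix of the endless iterator.
first_five = list(itertools.islice(naturals(), 5))
# Drain the finite one.
countdown = list(Countdown(3))
```

Both satisfy the iterator protocol and work with `for` loops and `itertools` exactly like an iterator over a list would, which is all "presents items in some order" asks for.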
The problem is that telling people it's a container is "overdescribing" it. We don't need to hypothesize about that. We have the space suits and burritos to prove it is not a good didactic approach. It doesn't simplify by removing something from the definition; it adds to the definition, exactly as I carefully showed in my description of "Iterator". An Iterator is "a thing that presents a series of items". It does not simplify the discussion of Iterator to say "It's a thing that presents a series of items out of a container, but also, it doesn't have to be a container". It's not the first definition that's "overdescribing", it's the second.
Part of why Haskell comes across as such an implacable curmudgeon is the predilection of its community to believe that users must grasp type theory and logic to use it.
They don't.
Just like they don't need to have a mental model of their computer to write software for it.
This has inspired me to try to update my post on the idea in a side window, but it's been sitting on my hard drive for over a year now and probably still has a ways to go yet.