Hacker News
by jtsuken 2083 days ago
"Don't trust your intuition". This should be the basis for all teaching in statistics and probability. If all goes wrong, it should be the one thing everyone remembers from their statistics education. And yet year after year, everyone is starting with E(X)=sum(x*P(x)) and has no idea what it was about afterwards.

With calculus and linear algebra your gut feel is about right on average. You can quickly get a feel for trajectories, acceleration and distances (derivatives and integrals), areas, volumes, amounts, etc. But on probability your gut-feel will always fool you.

In the end, you see a handful of math bloggers bemoaning the lack of education in probability and the nonsense being discussed by journalists and politicians. And it hardly matters whether it's an election or a pandemic. The lack of understanding of uncertainty and the false belief that one can reason about these without looking at the numbers too closely is dangerous.

Sorry about the rant. But...

Dear creator of seeing-theory.brown.edu, if there is one thing you could change about the project to make it different and infinitely more useful: Please start the first chapter with the goat problem[1], then go through a couple of examples from chapter 10 in Thinking Fast and Slow[2], then discuss information (maybe with a simplified version of Mendel's pea experiment[3]), discuss distributions and leave expectations and variances for much-much later.

[1]: https://en.wikipedia.org/wiki/Monty_Hall_problem [2]: https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow [3]: https://www.sciencelearn.org.nz/resources/1999-mendel-s-expe...

9 comments

On the topic of the Monty Hall problem, what helped me "believe" it more was if you change it to 1,000,000 doors, still with only 1 car, and the rest goats. You choose 1 door. The host then opens up 999,998 other doors, which all contain goats. So there are 2 doors left. Your door, and the only other door the host didn't open. Do you feel at a gut level that you should switch?
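A rough simulation sketch of the many-door version described above, for anyone who wants to check their gut against numbers. It assumes the host knows where the car is and always leaves exactly one other door shut; the function names `other_closed_door` and `win_rate` are made up for this illustration.

```python
import random

def other_closed_door(n_doors, car, pick, rng):
    """Door the host leaves shut, assuming he knows where the car is."""
    if pick != car:
        return car  # he can never open the car's door, so it stays shut
    # Contestant already holds the car: a random goat door stays shut.
    other = rng.randrange(n_doors - 1)
    return other if other < pick else other + 1

def win_rate(n_doors, switch, trials=10_000, seed=0):
    """Fraction of simulated games won by always staying or always switching."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        final = other_closed_door(n_doors, car, pick, rng) if switch else pick
        wins += (final == car)
    return wins / trials

if __name__ == "__main__":
    for n in (3, 1_000_000):
        print(n, "stay:", win_rate(n, False), "switch:", win_rate(n, True))
```

Under these assumptions the exact win probability for switching is (N-1)/N, so the simulated rate for a million doors should sit very close to 1.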
I see this argument a lot and for some reason it doesn't help me with the intuition at all. If you (wrongly) get caught up on the fact that the remaining door and your pick have the same initial probabilities of being a car, then you'll still think that switching doesn't make a difference even in the million-door case.

Here's what works for me:

- the switching strategy always gives you the opposite of your initial choice

- the initial probabilities are 2/3 goat and 1/3 car so by switching you get 2/3 car and 1/3 goat
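The first bullet can be checked mechanically. The sketch below assumes the standard rules (3 doors, the host always opens a goat door other than your pick) and verifies that switching flips the outcome in every possible game; `host_options` and `switching_flips_outcome` are names invented for this example.

```python
from itertools import product

DOORS = (0, 1, 2)

def host_options(car, pick):
    """Doors the host may open: neither the contestant's pick nor the car."""
    return [d for d in DOORS if d != pick and d != car]

def switching_flips_outcome():
    """True if, in every possible game, exactly one of stay/switch wins."""
    for car, pick in product(DOORS, repeat=2):
        for opened in host_options(car, pick):
            switched = next(d for d in DOORS if d not in (pick, opened))
            stay_wins = (pick == car)
            switch_wins = (switched == car)
            if stay_wins == switch_wins:
                return False
    return True

if __name__ == "__main__":
    print(switching_flips_outcome())
```

Since the first pick is a goat 2/3 of the time, a strategy that always flips the outcome wins 2/3 of the time.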

When Monty opens doors he uses 2 pieces of information: the door you picked and the correct door.

After he opens 999,998 doors he has given you quite a bit of information. There is a 1/1000000 chance, though, that he has given you no information (you picked the correct door).

But you're right that thinking about it in partitions also makes sense. You pick a partition of size 1 and hope it contains the prize, while Monty offers you the partition of size 999,999. If you go with his partition and it has the prize, you get it.

I think the point a lot of people miss is that the trick to understanding the 3-door question and the 1000000-door question is the same trick. If you don't grok the trick, the 1000000-door example might make it easier to grok, but there's a fair chance it won't.

To muddy the waters further, it's not always understood that in the 1000000-door case, 999998 other doors are opened (as evidenced by discussion elsewhere in these threads). Sometimes people think it's still just one door. I suspect this is because the original problem is usually stated as "...Monty Hall then opens one of the doors you didn't pick" and because people suggesting the 1000000-door version often just say "...what if there were one million doors?"

Rather than going through this convoluted kinda similar problem, I find it easier to stick to the original one.

Get a piece of paper. Draw all possible outcomes, 9 total (car is behind door 1 and you pick door 1, car is behind door 1 and you pick door 2, ...). If you never switch, 3 of the 9 result in success.

Now draw the outcomes again but switch every time. 6 out of 9 outcomes are a success.
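The pencil-and-paper count above can also be done in a few lines. This is only a sketch assuming the usual rules (the host always opens a goat door that is not your pick); `count_wins` is a name chosen for the example.

```python
from itertools import product

DOORS = (1, 2, 3)

def count_wins(switch):
    """Count winning cases among the 9 equally likely (car, pick) pairs."""
    wins = 0
    for car, pick in product(DOORS, repeat=2):
        if switch:
            # Host opens a goat door that is not the pick. When he has two
            # to choose from, the choice does not affect whether switching wins.
            opened = next(d for d in DOORS if d != pick and d != car)
            final = next(d for d in DOORS if d not in (pick, opened))
        else:
            final = pick
        wins += (final == car)
    return wins

if __name__ == "__main__":
    print("stay wins:", count_wins(False), "of 9")
    print("switch wins:", count_wins(True), "of 9")
```

It reports 3 wins out of 9 for staying and 6 out of 9 for switching, matching the drawing.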

What gave me the intuition for it was to realise that Monty is giving you the choice to either stick with your original door, or take the sum of the prizes behind the other two doors.

(He also opens one of those two doors to reveal a goat, but you already knew that one of them had a goat so that doesn't give you any additional information.)

Many people have suggested this "intuitive" explanation. But it's not at all clear or intuitive that jumping from 3 to 1,000,000 doors should lead the host to open 999,998 other doors rather than 1 other door.
"But it's not at all clear or intuitive that jumping from 3 to 1,000,000 doors should lead the host to open 999,998 other doors rather than 1 other door."

It SHOULD be clear, because you have two givens: 1) Monty never reveals the car. 2) He opens all the doors except 1.

"2) He opens all the doors except 1"

How is this a given exactly? In the original problem he only opens 1 other door. Now that also happens to be all doors except 1, but from just the 3 door problem that seems more coincidental than a fundamental part to the question
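The two readings of the many-door problem can be compared exactly. The sketch below assumes the host knowingly opens `n_opened` goat doors and the contestant then switches to one of the remaining closed doors uniformly at random; `p_win_switch` is a hypothetical helper, not something from the original problem statement.

```python
from fractions import Fraction

def p_win_switch(n_doors, n_opened):
    """Exact chance of winning by switching to a random remaining closed door."""
    if not 0 <= n_opened <= n_doors - 2:
        raise ValueError("host can open between 0 and n_doors - 2 doors")
    closed_others = n_doors - 1 - n_opened
    # First pick is wrong with probability (N-1)/N; the car is then behind
    # one of the other doors that are still closed.
    return Fraction(n_doors - 1, n_doors) * Fraction(1, closed_others)

if __name__ == "__main__":
    n = 1_000_000
    print("3 doors, 1 opened:", p_win_switch(3, 1))
    print("N doors, 1 opened:", float(p_win_switch(n, 1)))
    print("N doors, N-2 opened:", float(p_win_switch(n, n - 2)))
```

With a million doors and only one opened, switching is still better than staying, but only by a hair over 1 in a million. The dramatic effect appears only when all but one of the other doors are opened, which is why the two extensions feel so different.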

It's a given by the person who mentioned 999,998 doors. I think you're missing the point, but I won't pursue this further.
Sorry, but you are the one missing the point. The person who mentioned 999,998 doors didn't give any reasoning for why that would be the logical extension of the problem.

Obviously, you and I know it is, but the person grappling with the Monty Hall problem is right in not being convinced of that just because someone says it is!

pyhtel, I gave it a go at explaining the rationale in this comment I made below: https://news.ycombinator.com/item?id=24643272
em500, I gave it a go at explaining the rationale in this comment I made below: https://news.ycombinator.com/item?id=24643272
But this raises a different problem with intuition:

If Monty doesn't know where the car is, then if 999,998 doors were opened showing goats, leaving two doors, the odds that the car is behind your door or behind the remaining door are 1:1 ... this defies many people's intuition.

The difference between the two cases is that, if Monty knows where the car is, then his opening 999,998 doors with goats behind them is exactly what we expect, whereas if he doesn't know where the car is, then his opening 999,998 doors with goats behind them is an extraordinarily unlikely event. But if that does happen despite being extraordinarily unlikely, then there's still a 50% chance that the car is behind your door.
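A simulation sketch of the ignorant-host case, assuming Monty leaves one of the other doors shut uniformly at random and that games where he reveals the car are simply thrown away. The names `ignorant_host_game` and `conditional_rates` are invented here, and small door counts are used so that enough games survive the filter.

```python
import random

def ignorant_host_game(n_doors, rng):
    """Return None if the host reveals the car, else (stay_wins, switch_wins)."""
    car = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    others = [d for d in range(n_doors) if d != pick]
    left_closed = rng.choice(others)  # host has no knowledge: random door stays shut
    opened = [d for d in others if d != left_closed]
    if car in opened:
        return None
    return pick == car, left_closed == car

def conditional_rates(n_doors, trials=100_000, seed=0):
    """Share of games kept, then stay/switch win rates among the kept games."""
    rng = random.Random(seed)
    kept = stay = switch = 0
    for _ in range(trials):
        result = ignorant_host_game(n_doors, rng)
        if result is None:
            continue
        kept += 1
        stay += result[0]
        switch += result[1]
    return kept / trials, stay / kept, switch / kept

if __name__ == "__main__":
    for n in (3, 10):
        print(n, conditional_rates(n))
```

Among the games where only goats happen to be revealed, staying and switching should each win about half the time, while the share of surviving games shrinks as the number of doors grows (2/N under these assumptions).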

My intuition is that:

1) I will probably lose when Monty opens a car door. 2) If I don't, I am really gambling between whether I made a 1-in-a-million pick or Monty did (in the choice of which door to leave shut), which obviously has even odds.

Interestingly, by compressing this problem back down to the 3-door version, it makes it pretty obvious why that's the case (and aligns with people's intuition about the original problem). Also interesting that in this case, even if the intuition is wrong (that 'obviously' they must have picked the car), the outcome (sticking with the chosen door) is an optimal strategy.

What's unintuitive about the Monty Hall problem is the difference between mathematical Monty and a psychological Monty. It is easy to imagine Monty almost only opening doors in case the guest chose the car, and not opening anything in case they chose a goat. So, when presented with the choice, a cautious guest will hesitate to change. If, however, the guest knows from previous shows that Monty will ALWAYS open a goat door, it is still mentally hard to change the cautious strategy.
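The "psychological Monty" can be put into numbers with Bayes' rule. The sketch below assumes a 3-door game in which the host decides whether to open a goat door and offer the switch with a probability that depends on whether the guest's first pick was the car; the function and both parameters are hypothetical modelling choices, not part of the classic problem.

```python
from fractions import Fraction

def p_switch_wins_given_offer(offer_if_car, offer_if_goat):
    """Chance that switching wins, given that the host made the offer at all."""
    p_car, p_goat = Fraction(1, 3), Fraction(2, 3)
    offered = p_car * offer_if_car + p_goat * offer_if_goat
    if offered == 0:
        raise ValueError("the host never makes an offer under this policy")
    # Switching wins exactly when the first pick was a goat.
    return p_goat * offer_if_goat / offered

if __name__ == "__main__":
    print("always offers:", p_switch_wins_given_offer(1, 1))
    print("offers only if guest has the car:", p_switch_wins_given_offer(1, 0))
    print("offers half the time on a goat:", p_switch_wins_given_offer(1, Fraction(1, 2)))
```

With a host who always offers, switching wins 2/3 of the time; with one who only offers when the guest already holds the car, switching never wins. So the cautious guest is reasoning correctly about a different host.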
The mathematics behind probability and statistics is about as ripe for intuition as calculus and linear algebra. A lot of it really comes down to counting in probability (calculus/measure theory for the continuous case) and quantifying properties about probability distributions for statistics.

The really hard part is the modelling part, where you transform the problem to a mathematical statement and vice versa. It's very easy to misinterpret both the problem in terms of mathematics and the mathematical result in terms of the problem. All the wrong answers to brain teasers like the Monty Hall problem, the Tuesday boy problem etc., are right answers to the wrong question.

Unfortunately, in education we do not seem to want to discuss the modelling part on equal terms with the theory. We seem to be okay with solving the entire problem, or solving just the theoretical part with no regard to the application, but expressing just the mathematical problem to be solved is never appreciated. In a calculus setting, this could be showing that the answer to some physical problem depends on the solution of some partial differential equation -- even if you do not have the tools to solve it outright.

My guess is that it's just easier to teach theory with clear cut answers. Modelling the real world is ambiguous and hard.

It's because math courses are crazy expensive and teaching both modelling and theory together would mean taking way longer (read costing way more) or making the failure rate (cost) way higher.
Wow, strong disagree. Once you develop intuition, probability is really quite intuitive. This kind of course should be working to develop this intuition — like the conditional probability examples and the CLT examples. The computational examples inline really help here.

The Monty Hall problem is more of a curiosity than a fundamental principle!

(Was a TA in undergrad engineering probability for 2 years, saw my share of learners.)

I can't argue with "once you develop intuition, probability is intuitive". I was arguing that lessons starting with E(X)=... basically stop the majority of people from getting to the point, where they see how their "initial intuition" is wrong.

Convincing as many people as possible that statistical intuition is not something we are born with should be the key priority of any probability and statistics class.

Monty Hall was one example. The birthday problem and the base rate fallacy are two more [1][2]. The result seems obvious but most people get these wrong.

With a couple of papers or books by Kahneman and Tversky in hand we can generate an almost infinite list of simple statistics/probability questions, which most people get wrong. Let people make some mistakes, before dumping the theory on them.

[1]https://en.wikipedia.org/wiki/Base_rate_fallacy [2]https://en.wikipedia.org/wiki/Birthday_problem
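Both examples fit in a few lines of exact arithmetic. This is a sketch under the textbook assumptions (365 equally likely birthdays, no leap years); the screening-test numbers in the demo are made up purely for illustration, and the function names are invented for this example.

```python
from fractions import Fraction

def p_shared_birthday(n_people, days=365):
    """Chance that at least two of n_people share a birthday."""
    p_all_distinct = Fraction(1)
    for i in range(n_people):
        p_all_distinct *= Fraction(days - i, days)
    return 1 - p_all_distinct

def p_condition_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' rule: chance of having the condition after a positive test."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    print("23 people:", float(p_shared_birthday(23)))
    # Hypothetical test: 1% prevalence, 90% sensitivity, 9% false positives.
    ppv = p_condition_given_positive(Fraction(1, 100), Fraction(9, 10), Fraction(9, 100))
    print("P(condition | positive):", ppv, "=", float(ppv))
```

With 23 people the chance of a shared birthday is already just over one half, and with the hypothetical test above a positive result means only a 10/109 (about 9%) chance of actually having the condition.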

> Monty Hall was one example.

Monty Hall is not a good example, unless it is explicitly stated that Monty knows where the car is and that he deliberately opens a door with a goat. Just look at the discussions in the comments here.

> Convincing as many people as possible that statistical intuition is not something we are born with should be the key priority of any probability and statistics class.

Again, strong disagree. Probability has been understood at a quantitative level since Laplace (1812). Modern measure-theoretic probability dates from Kolmogorov's foundational work (1933). All these years later, we really know this stuff.

Specifically: A lot of general-purpose, powerful tools have been developed. Distribution theory, the strong law of large numbers, the CLT, maximum likelihood, L2 theory for estimation.

Depending on your goals, these or related tools are capable of addressing a wide range of problems. The priority of the first few courses should be to impart mastery of a selection of these general-purpose tools, so that students know how to analyze problems probabilistically. This is where intuition comes from.

Gotcha problems like Monty Hall are not getting you to this goal!

One could argue that MHP can motivate the notion of conditioning, but I think fundamentally the MHP is verbal legerdemain. That is, you state the problem such that the conditioning is implicit in the actions, and people don't notice it. Recall that the questioner obtains "victory" when, after presenting the problem, the answerer is confused and gives the wrong answer. I don't like that approach as a teaching tool.

I'm also skeptical of the Birthday Problem and the Kahneman-Tversky surprises. I see value in these surprising conundrums (the Birthday Problem is in volume 1 of Feller, so it has a pedigree) only to the extent that they motivate the utility of general-purpose analytic tools. They are an appetizer, not the main dish.

> Probability has been understood at a quantitative level since Laplace [...] and Kolmogorov.

Which indicates it is roughly as hard as partial differential equations, the theory of relativity and just a tiny bit easier than some of the quantum mechanics.

This is pretty unintuitive for a subject, which mostly relies on multiplication and addition.

The dozen or so posts discussing the intuition of the Monty Hall Problem are a case in point.

> They are an appetizer, not the main dish.

This is certainly true.

"Once you develop intuition, probability is really quite intuitive"

That's a tautology.

Plenty of studies, such as the work by Kahneman and Tversky, show that humans by default have incorrect statistical intuitions. These faulty intuitions are hard to overcome, even by a considerable amount of training.

> The Monty Hall problem is more of a curiosity than a fundamental principle!

It's quite straightforward conditional probability. That so many people, including trained mathematicians, get it wrong is quite illustrative. And it's not unique ... the coins and drawers problem is similar, and one can craft many others. MH is not a mere curiosity, it's simply well known.

> It's quite straightforward ...

No it is not, unless it is explicitly stated that Monty knows where the car is and that he deliberately opens a door with a goat. Just look at the discussions in the comments here.

> unless it is explicitly stated that Monty knows where the car is and that he deliberately opens a door with a goat.

That has been part of the explicit problem ever since it was first presented back in 1975.

"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, 'Do you want to pick door No. 2?' Is it to your advantage to switch your choice?"

It does not clearly say that it is Monty’s procedure to

- always open a door

- always open a door with a goat

- and not open a door at random

Often discussion of the solution reveals that this is not clear.

There are three doors. You have picked one, leaving two other doors. It absolutely explicitly says he opens one door. The only options are a goat or a car. If it was a car, you would have lost already and so there is no problem. If it was random, you still get the same information (what is behind one of the unpicked doors).
This is goalpost moving that has nothing to do with the original point. If people misunderstand the conditions of the problem, that has nothing to do with intuitions about probability.

I won't respond further.

It is a tautology, but we are studying teaching, so maybe that's not unexpected? ;-)

My point is that the goal of the course should be to understand principles, not to teach people that their existing intuition is faulty. Who cares about their prior condition of ignorance?

For more, see my reply nearby.

I'm always wondering the answer to this question:

Why does multiplication coupled with some sort of integral calculus work the way it does? We multiply to get moments of a distribution, we multiply to convolve, we multiply to get the work done on an object. I suppose the answer is multiplication allows us to scale some function f(x) with some function g(x). But I guess I want something deeper and I feel like I'm missing it.

We are very good at finding correlations. It is still very hard to prove causality in natural phenomena from experiments, especially when we cannot control them. This became blatantly obvious in the covid outbreak where nobody had a clue for months about whether masks would help or not. Edit to clarify: It is very hard to prove causality and be sure that you did not mess up.
You are right about people confusing causality and correlation. Otherwise this site wouldn't be so funny: https://www.tylervigen.com/spurious-correlations

You are wrong about people being good at finding correlations. I have rarely met people who can process a sufficiently large sample size in their memory to calculate any significant correlation results. And guessing correlations from charts exposes you to a number of optical illusions, which will fool the brain into seeing things that don't exist.

There may be a propensity to make more type 1 errors (false positives) and see correlations between any random things such as 5G and COVID, but I haven't seen any research on that.
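One way to see how easily unrelated quantities look correlated when only a handful of observations are available is the small simulation below. It is only a sketch: both variables are independent standard normal draws, the 0.5 cutoff for a "strong" correlation is an arbitrary choice, and `pearson_r` and `share_of_strong_r` are names invented for this example.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def share_of_strong_r(sample_size, threshold=0.5, trials=2_000, seed=0):
    """How often two independent samples show |r| above the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        hits += abs(pearson_r(xs, ys)) > threshold
    return hits / trials

if __name__ == "__main__":
    for n in (5, 20, 100):
        print(n, share_of_strong_r(n))
```

With 5 observations per sample a sizeable share of purely random pairs clears the cutoff, while with 100 observations it almost never happens.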

> But on probability your gut-feel will always fool you.

Yes, this is especially true when first learning; however, one can still develop intuition so that it serves as an invaluable motivator and guide through difficult problems.

My professor for statistics (he was quite famous in the field) talked about Monty Hall, but made clear that he will not give a solution because of science-political reasons.
I am incredibly curious what he meant by "science-political reasons."
Bringing up Monty Hall at a table full of tech people has always resulted in an argument that will not end until one person gets the rest of us to admit we are wrong and changing doors is the same probability as staying with the same door.
Me too, but he refused to explain. I think there must have been a time where choosing a side was able to end friendships.
Why are you calling the Monty Hall problem the goat problem?

It is known in academic circles as Monty Hall and when it pops up in popular media, it is also referred to as Monty Hall.

Thanks for pointing this out. While Google and Wikipedia confidently redirect me to the Monty Hall article, which does mention a goat at some point, the common name for it in English is "Monty Hall Problem". In other languages it's a three-door-problem or the goat problem, but given that it's such a good example for so many things in psychology and maths, I should be using the most common name in each language.
From Wikipedia[0]

... It became famous as a question from a reader's letter quoted in Marilyn vos Savant's "Ask Marilyn" column in Parade magazine in 1990 (vos Savant 1990a):

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

[0] https://en.wikipedia.org/wiki/Monty_Hall_problem

"The law of Small Numbers" chapter 10?