Hacker News
by mlechha 2797 days ago
Whenever this comes up, I think about the conjunction fallacy (https://en.m.wikipedia.org/wiki/Conjunction_fallacy): the observation that human subjects seem to assign a higher probability to a conjunction of events than to one of those events alone. That is weird, because the probability of two events occurring together (the conjunction) is always less than or equal to the probability of either event on its own.
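
To make the rule concrete, here is a minimal sketch with a made-up joint distribution over two binary events. The four weights are illustrative assumptions, not data from any study; whatever non-negative weights you put in, the conjunction cannot come out more probable than either event alone.

```python
# Hypothetical joint distribution over two binary events A and B.
# The weights are made up for illustration; any non-negative weights
# that sum to 1 would satisfy the same inequality.
joint = {
    (True, True): 0.05,    # A and B
    (True, False): 0.10,   # A and not B
    (False, True): 0.60,   # not A and B
    (False, False): 0.25,  # neither
}

def prob(event):
    """Sum the weights of every outcome (a, b) for which event(a, b) holds."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_a_and_b = prob(lambda a, b: a and b)

# The outcomes counted in P(A and B) are a subset of those counted in P(A)
# and in P(B), so the conjunction cannot exceed either marginal.
assert p_a_and_b <= p_a
assert p_a_and_b <= p_b
```

Note that B on its own is quite likely here (0.65), yet "A and B" is still no more likely than the unlikely A.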

How does the Bayesian brain hypothesis deal with this fallacy? It seems to me that nothing based on classical probability can explain this fallacy. So either the observation that humans can assign higher probability to joint events is wrong or human decision making isn't exactly probabilistic (in the classical sense, can't rule out exotic probabilistic approaches).

EDIT: Several folks have commented that the conjunction fallacy can be explained away by arguments based on interpretation and semantics. Indeed, the original Linda problem was susceptible to these issues. However, several researchers have since studied the effect more carefully, and it seems to persist. One example I'm aware of is https://link.springer.com/article/10.3758/BF03195280 where the authors used unambiguous language and a betting paradigm but still found the effect. Again, this is most likely not foolproof. Regardless, I do not think the fallacy can be trivially explained away as an effect of ambiguous language.

7 comments

Regarding the conjunction fallacy, my theory is that many people who are not used to math problems implicitly modify the question based on the context.

Instead of choosing from options:

1) Linda is A

2) Linda is A and B

they might actually understand the first statement in the context of the second statement as:

1) Linda is A and not B

FWIW, there was a similar experiment in the form of a dice game [1], and the phenomenon was still present.

[1] https://www.lesswrong.com/posts/QAK43nNCTQQycAcYe/conjunctio...

No, that's not what's going on. The reason for the fallacy is that we tend to find more detailed stories more convincing than less detailed stories.

Not everybody does, though; you can be trained against it. I've heard that police interrogators are less prone to this fallacy because they know that liars who have had time to prepare often add dozens of details to their story that no ordinary person would remember.

> 'fallacy is that we tend to find more detailed stories more convincing'

Anecdotal: somebody asked, "Why do I so often feel angry when I get a 'Type 5 answer'?" (Background: a Type 5 answer belongs to the field of manipulating someone's reality.)

HINT: Type 1 answer: labeling of 'objects' / Type 2 answer: naming of coherencies / Ty ....(-;

Good point. But... this particular objection has a flaw (which may be copied by the BBH).

The Bayesian machinery tells you how to update your beliefs given evidence. It doesn't tell you what shape those beliefs should be in the first place.

My theory is that we carry around a deck of personas or stereotypes. We hear of a new person and some behaviour of theirs. We then predict which persona was likely to generate that behaviour. Conditional on the resulting distribution over personas, we answer questions about that person's predicted behaviour.

In the `Linda` example the theory above suggests we take the background information about all their political commitments at college, which from the description would seem to give strong evidence about their persona.

The common and wrong answer to the question stems from predicting, based on the persona, that the person would continue with their political commitments.
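
Here is a rough sketch of how I read the persona idea, written as a two-persona mixture model. Every persona name, prior and likelihood below is a hypothetical number picked for illustration, and the conditional-independence assumption is mine. The description pushes almost all the posterior mass onto one persona, which makes "feminist" feel like a near-certain continuation, yet the exact predictive probabilities still obey the conjunction rule.

```python
# All personas, priors and likelihoods are hypothetical illustration values.
# Each entry: (prior, P(description | persona),
#              P(bank teller | persona), P(feminist | persona))
personas = {
    "activist":     (0.10, 0.80, 0.01, 0.90),
    "conventional": (0.90, 0.02, 0.05, 0.10),
}

# Bayes' rule: posterior over personas given the background description.
evidence = sum(prior * like for prior, like, _, _ in personas.values())
posterior = {
    name: prior * like / evidence
    for name, (prior, like, _, _) in personas.items()
}

# Posterior predictive probabilities, assuming (for this sketch only) that
# the two traits are independent once the persona is known.
p_feminist = sum(posterior[n] * personas[n][3] for n in personas)
p_teller = sum(posterior[n] * personas[n][2] for n in personas)
p_teller_and_feminist = sum(
    posterior[n] * personas[n][2] * personas[n][3] for n in personas
)

# The persona strongly predicts "feminist", but the conjunction is still
# no more probable than "bank teller" alone.
assert p_feminist > 0.5
assert p_teller_and_feminist <= p_teller
```

On this reading, the wrong answer comes from reporting how well each statement fits the dominant persona rather than from the predictive probabilities themselves.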

The 'shape being wrong' issue here is that maybe personas are not the right way to structure the problem. But Bayes's theorem doesn't tell you that. That's a whole load of extra machinery that people have additionally developed and should deploy when using Bayes's theorem.

Back to the word problem. An issue is that the options:

- A

- A&B

both include A. In a world where A always happens, the only non-trivial way to read the question is that the first option implicitly means (A&!B).

If A is assumed to be true, the question becomes whether B is more likely to be true or false. The background information given is a reasonable explanation for why people commonly select (A&B).
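
The two readings can be compared in a short sketch. The probabilities are hypothetical, chosen only so that B is likely given A; the variable names are mine.

```python
# Hypothetical probabilities, chosen only for illustration.
p_a = 0.05          # P(A)
p_b_given_a = 0.7   # P(B | A), assumed high because of the background story

p_a_and_b = p_a * p_b_given_a            # P(A and B)
p_a_and_not_b = p_a * (1 - p_b_given_a)  # P(A and not B)

# Literal reading: compare A with A&B. The conjunction can never win.
literal_choice = "A&B" if p_a_and_b > p_a else "A"

# Implicit reading: compare A&!B with A&B. P(A) cancels out, so this
# reduces to asking whether P(B | A) is greater than 0.5.
implicit_choice = "A&B" if p_a_and_b > p_a_and_not_b else "A&!B"

print(literal_choice, implicit_choice)
```

Under the implicit reading, picking (A&B) is the correct answer to the question the subject thinks they were asked.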

Solving epistemology doesn't solve ontology. Bingo.

In more strictly mathematical terms, having a "normative" update rule (Bayes' rule) doesn't tell you what topology of latent variables the generative model "ought" to have, only how to link new information into a preexisting generative model.

Using the KL divergence of the posterior predictive distribution as a target to optimize does a bit better, but still isn't a "solution".

Seriously, what does "topology of latent variables" even mean?

A topology on U is a system of subsets of U that is closed under arbitrary unions and finite intersections and contains the empty set and U itself. Go!
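
For anyone who wants that definition in concrete form, here is a small sketch that checks the axioms for a family of subsets of a finite set. The function name and the example families are made up for illustration. On a finite set, closure under pairwise unions and intersections is enough to give closure under arbitrary unions and finite intersections.

```python
from itertools import combinations

def is_topology(universe, family):
    """Check the topology axioms for a family of subsets of a finite set."""
    u = frozenset(universe)
    opens = {frozenset(s) for s in family}
    # Must contain the empty set and the whole set.
    if frozenset() not in opens or u not in opens:
        return False
    # Every member must actually be a subset of the universe.
    if any(not s <= u for s in opens):
        return False
    # Closed under pairwise union and intersection.
    return all(
        (a | b) in opens and (a & b) in opens
        for a, b in combinations(opens, 2)
    )

U = {1, 2, 3}
chain = [set(), {1}, {1, 2}, U]   # nested sets: satisfies the axioms
broken = [set(), {1}, {2}, U]     # the union {1, 2} is missing

print(is_topology(U, chain), is_topology(U, broken))
```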

>Seriously, what does "topology of latent variables" even mean?

The simple answer is: the graph topology of the resulting program traces, equivalent to the topology of a graphical model sampled from a distribution over graphical models. The complicated answer is: the Scott topology of the program-trace space.

This is a fallacy of interpretation (translating verbal representations of events into mental representations of events), not of probabilistic reasoning.

The crux, I think, is that when

A. bank teller

is juxtaposed with

B. bank teller and feminist,

we vaguely and falsely interpret A as "bank teller and not feminist", while the correct explicit interpretation is "bank teller and possibly feminist, but possibly not".

Just to add something that I found interesting when my logic professor said it at the time: while that “bank teller and not feminist” interpretation is strictly incorrect in a theoretical world, IRL it’s useful for humans to just assume pieces of information. Most people are not particularly feminist, so it’s probably safe to assume someone - especially a bank teller, maybe - isn’t feminist if it’s not mentioned.

Rather than a fallacy of some kind, people may actually be so good at correct inferences that they are bad at leaving that intuition out of their reasoning process when thinking about a weird, outlying theoretical world.

It doesn't seem surprising at all that we fail to understand abstract questions like this. The experts only learned to answer them correctly after millennia's worth of mathematical development!

To study how accurately Bayesian some animals are, I think you need to find ways to pose them questions which are relevant to them, and read off their inferences from their behaviour. This is obviously harder to do, and you have to worry a great deal about the animal optimising for something that differs from your first guess (e.g. it doesn't want maximum food on a good day, it wants not to starve on a bad day). I don't have references to hand but I think that when we can do this, the results are quite good.

That's still a fallacy, even if the mechanism behind it is successful on average. Assuming a bank teller is non-feminist is rational only if the feminism matters, which it does not. Even if 0 bank tellers are feminist, it's still unhelpful (and harmful if even 1 bank teller is feminist) to assume that an unknown teller is non-feminist for the purpose of the question. Prejudices are often statistically more correct than incorrect (though socially problematic), but in some cases they are still flat-out incorrect, as they are in this example. Too much of a "good" thing is toxic.

The Wikipedia article on the effect mentions that the experiment has been repeated with more explicit wording to correct for that; this reduces the magnitude of the effect, but the effect still exists.

These are structured as: Do you choose (A) alone, or do you choose (A) and (B)? (A) is often benign and (B) is often something the subject thinks is a good fit. So, in some ways these seem like "trick questions". I wonder what happens if both options are benign. As the Wikipedia article mentions, in the typical case the subject may end up using an easy heuristic rather than thinking hard about the actual definition of probability and such.

There are other ways the brain could be Bayesian though. For example, at the lowest level our neurons could use Bayesian inference. How this manifests itself at much higher levels such as with language and planning might be difficult to predict.

I would describe that heuristic as Bayesian, since the subject evaluates the prior evidence for each option, and then chooses the one with the most supporting information, instead of considering the options' logical or mathematical properties.

Most of the decision making while shooting for a hoop in basketball (the example from the article) is subconscious and reflexive, while decisions related to the conjunction fallacy are conscious. So it's quite possible that different mechanisms are used in subconscious motor-control reflexes and conscious 'logical' decision making.

I realise both are implemented using neural networks, but, as an analogy, digital computers can implement Bayesian algorithms, Boolean algorithms, predicate logic, etc. It's quite possible our neural systems have been optimised for different behaviours to solve different problems, and in _some_ cases these implement or approximate Bayesian methods.

I don't think that's right, or at least not so binary. The average person (and even highly rational people, when they are being informal) uses "logical" reasoning that is intuitive, not calculated, and not so different from the "muscle memory" of shooting hoops. Even professional mathematicians (in published papers!) sometimes make statements that "feel" correct even though they are disproven under scrutiny (and not because of small but important mistakes like getting a sign wrong in a calculation).

> How does the Bayesian brain hypothesis deal with this fallacy?

Maybe the brain operates here on a syntactic level, asking which sentence is more likely to appear in a narrative about the subject, and System 2 ([0]) fails to preempt the answer.

[0]: https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow Two systems

Under the assumption of noise in the inputs, human judgements are probabilistically rational: https://onlinelibrary.wiley.com/doi/pdf/10.1002/bdm.618
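
A toy simulation of the general idea, not the model from the linked paper: suppose each probability estimate is the true value plus independent Gaussian noise, clipped to [0, 1]. The noise model, the function name and all the numbers are assumptions of this sketch. Even though the true probabilities obey the conjunction rule, the noisy estimates violate it in a fraction of trials, and more often when the conjunction is nearly as probable as the single event.

```python
import random

def fallacy_rate(p_a, p_a_and_b, noise_sd, trials=100_000, seed=0):
    """Fraction of trials in which the noisy estimate of P(A and B)
    exceeds the noisy estimate of P(A).

    Each estimate is the true probability plus independent Gaussian
    noise, clipped to [0, 1] (an assumption of this sketch).
    """
    rng = random.Random(seed)

    def clip(x):
        return min(1.0, max(0.0, x))

    hits = 0
    for _ in range(trials):
        est_a = clip(p_a + rng.gauss(0.0, noise_sd))
        est_ab = clip(p_a_and_b + rng.gauss(0.0, noise_sd))
        if est_ab > est_a:
            hits += 1
    return hits / trials

# True probabilities always satisfy P(A and B) <= P(A).
no_noise = fallacy_rate(p_a=0.30, p_a_and_b=0.25, noise_sd=0.0)
close = fallacy_rate(p_a=0.30, p_a_and_b=0.25, noise_sd=0.10)
far = fallacy_rate(p_a=0.30, p_a_and_b=0.05, noise_sd=0.10)

print(no_noise, close, far)
```

With no noise the violation rate is zero; with noise it is nonzero but stays below one half, since the conjunction's true probability is the smaller of the two.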