Hacker News
by new2628 2141 days ago
Controversial opinion: Bayes Theorem is overrated. In real life we usually have no idea about priors, and we have close to zero chance of getting a good estimate of the true probability of something. But we can still get by fine for the most part by focusing on limiting possible loss and staying on the safe side with large margins.

Many of the claimed cognitive biases go away under this view. One textbook example of Bayes theorem is how doctors overestimate the probability of a patient being positive for a disease. But what are the priors? Maybe those who visit the doctor did something risky the day before or are feeling funny. Maybe the cost of a false positive is negligible compared to the cost of a false negative, etc. People are less stupid than the TED talk crowd claims.

8 comments

It's a cheap trick to start an argument with "controversial opinion" or any other similar phrase.

The funny truth in this case is that it's not only cheap, but a factual counterpoint to your argument:

By stating that what is to follow is controversial, you give a prior to reading your argument. So that when the reader evaluates it, he already does so from the perspective that it's controversial and thus one shouldn't be too harsh in criticizing it further. This is the real life application of the Bayes theorem from the author of the linked article.

You see? You say it's overrated. But you use it anyway.

So the next time you try to shield yourself from critique, try to build a better argument.

> when the reader evaluates it, he already does so from the perspective that it's controversial

He does so from the perspective that the author believes it's controversial. If you are required to actually assume the opinion is controversial because the author said so, I'd start every paragraph with "you owe me money".

Now that's a pretentious dismissal if I ever heard one.

It is an advantage that priors must be explicitly chosen.

There is always a prior. The question is how aware you are of it.

Aware enough to claim that all arguments are smoke screens for people's innate biases.

An ideal Bayesian would employ the principle of maximum entropy for choosing priors [1], soon to discover the problem of underdetermination [2].

Such a person would suffer a death akin to Buridan's ass [3].

1. https://en.wikipedia.org/wiki/Principle_of_maximum_entropy

2. https://en.wikipedia.org/wiki/Underdetermination

3. https://en.wikipedia.org/wiki/Buridan%27s_ass

> There is always a prior. The question is how aware you are of it.

Very well said. Not to mention that being able to tune the prior shows how much you are depending on those assumptions.
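
One way to make "tuning the prior" concrete is a small sensitivity sweep: hold the evidence fixed and watch how far the posterior moves as the prior changes. The sketch below uses the odds form of Bayes' theorem. The likelihood ratio of 10 and the list of priors are made-up numbers for illustration, not anything from the thread.

```python
def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior P(H|D) from a prior P(H) and a likelihood ratio
    P(D|H) / P(D|not H), using posterior odds = prior odds * ratio."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def sensitivity(priors, likelihood_ratio):
    """Pair each candidate prior with the posterior it leads to."""
    return [(p, posterior(p, likelihood_ratio)) for p in priors]


if __name__ == "__main__":
    # Hypothetical evidence that is 10x more likely under H than under not-H.
    for p, post in sensitivity([0.01, 0.1, 0.3, 0.5], 10.0):
        print(f"prior={p:.2f} -> posterior={post:.3f}")
```

If the conclusion flips somewhere inside the range of priors you find plausible, the result rests on the prior rather than on the data, and that is worth knowing.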

How is making the prior explicit an advantage? If the prior is arbitrary anyway, you could just as well stick to your unknown prejudice. This shouldn't change any results, and if it does you are in trouble anyway, whether or not you state your prejudice explicitly. I still suspect that Bayesian statistics is just a kind of hack to make results look more convincing.

> If the prior is arbitrary anyways you could also stick to your unknown prejudice

One way to think about a prior is to make your prejudices transparent rather than unknown.

But this might be a negative, because you can't consciously tweak an unknown prejudice, whereas you can tweak a prior until your results support your hypothesis. In that sense, Bayesian statistics might be more transparent, but less honest.

Bayesian approaches are more transparent regardless of whether they are "honest" or not.

True, but the question is whether transparency is desirable. I would say it is dangerous for three reasons. First, you might be tempted to tweak your prior until your posterior confirms your hypothesis. Second, using Bayesian reasoning, you make it seem that the first procedure is justified. And third, if everyone within a scientific community does the tweaking, nobody would complain, since everyone would automatically confirm their hypothesis with a higher posterior probability.

If by prior you just mean "I know something about it" or "everything happens in context", then that's fine. But if that's what you mean, then only a vanishingly small number of events have "priors" which can be expressed in a neat analytical form, or be approximated, or even be quantified. This is part of the problem of frame and context that ML v1.0 tried hard to solve.

Recall as well that in the Bayesian approach the model itself is not subject to Bayesian updating: it's part of your prior, except that you never update it. So you're not merely choosing how to update parameters given data; you're also choosing what you're not going to update.

There is always a prior only if you really care about computing probabilities. The implicit assumption in Bayesian data analysis is that you go first to "best possible estimate of probability", then to "decision based on that". My point was that you usually need not do the first step.

Example: I wear a bicycle helmet because it costs me next to nothing and it possibly saves my life. I don't do any Bayesian analysis implicitly or explicitly, because on one side there is an outcome with value minus infinity, so it hardly matters what probability I multiply it with.

You don't need to think hard about massively asymmetric payoffs.

Now what if you needed something like a $5k licence to wear the helmet? Would you feel like thinking harder and analysing further than you did? Most interesting decisions are more like this.
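
The helmet and licence cases can be compared with a break-even calculation: a precaution pays off when probability times loss avoided exceeds its cost, so the only thing you need to know about the probability is whether it sits above cost / loss. A minimal sketch follows. The $50 helmet, the $5,000 licence, the 1-in-1,000 accident chance and the $1,000,000 finite stand-in for the "minus infinity" outcome are all hypothetical numbers.

```python
def break_even_probability(cost: float, loss_avoided: float) -> float:
    """Smallest probability of the bad outcome at which the precaution pays off."""
    return cost / loss_avoided


def worth_it(p: float, cost: float, loss_avoided: float) -> bool:
    """True when the expected loss avoided exceeds the cost of the precaution."""
    return p * loss_avoided > cost


if __name__ == "__main__":
    loss = 1_000_000.0       # assumed finite stand-in for a catastrophic outcome
    p_accident = 0.001       # assumed rough guess, deliberately imprecise

    for label, cost in [("helmet only", 50.0), ("helmet + licence", 5_050.0)]:
        threshold = break_even_probability(cost, loss)
        decision = worth_it(p_accident, cost, loss)
        print(f"{label}: break-even p = {threshold:.5f}, worth it: {decision}")
```

With a cheap helmet the threshold is so low that almost any guess at the probability gives the same answer. Once the cost rises, the guess starts to matter and the estimate has to be better.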

"Possibly saves my life" is your prior there, btw.

Yes, but it's unknown. And hence any meaningful estimation made using that prior means that it is based on an assumption, which may or may not hold.

You can always pick a know-nothing prior. For a binary decision it would be 50%.

Bayes Theorem is one of the most fundamental theorems in the history of mathematics. I have yet to work in a field where it doesn't have deeply fundamental applications. In many cases expert knowledge or heuristic rules serve as the prior.

Saying it is overrated is like saying sun or air is overrated.

I 100% agree with you.

But hyperbole aside, OP also has a point. If we forget that the estimation of probability in itself has a cost, we could be tempted to put more and more resources into more and more sophisticated methods of data collection and analysis to be more and more certain of our estimate. But if we remember that this process has a cost, sometimes it's more efficient to just add a margin of safety and move on with your life. Bayes theorem is often used for resource allocation, but the process of optimizing resource allocation in itself has a cost.

There is a reason one learns about it in high school after all.

I agree with the first part. We get by fine for the most part by our own intuitions driven by fear and risk aversion. We are constantly triggered into action, not persuaded -- not by ourselves or anyone. But I think the blog here is a call to be more rational. I would consider Bayes another tool out of many. Unfortunately that doesn't change how we are though. We're still hungry and trigger driven at the end of the day.

Which is why I disagree with the other thing you said. People are pretty stupid. To think you know anything without prior research is stupid. Priors need to be deliberately created (through learning, understanding and internalizing) for a guess to be educated. Without that we default to stupid, so most of us are stupid about most things.

But just getting by is an incredibly low bar, and the covid situation has put us to the test. Take mask wearing: it isn't a "priors" issue. It's the understanding of what a mask does and how it influences risk that is important. You don't need authority to understand the benefits of a mask, though once understood, it would definitely fall under "staying on the safe side with large margins".

At least that is my take.

That's an interesting response, thanks! I think where I disagree is that I think people are pretty smart, at least in one thing, which is survival -- the proof of that is that those who were not, quickly exited the gene pool. That is a powerful filtering that tunes our estimators.

I agree with your last paragraph, but I think it supports my point. I wear a mask because it has zero cost, and it may save my life. When I took this decision, I didn't estimate any probabilities and I haven't used Bayes theorem. Understanding what a mask does exactly and how aerosol transmission of viruses works precisely is almost irrelevant to my decision -- I could be improving my knowledge ("my priors") by studying virology, but there would still be so many uncertainties that it would hardly influence my decision.

> In real life usually we have no idea about priors

Priors are your previous knowledge on the topic.

> One textbook example of Bayes theorem is how doctors overestimate the probability of being positive for a disease. But what are the priors?

In this example, doctors overestimate precisely because they don't take the priors into account.

Doing something risky the day before / feeling funny is extra evidence that is assimilated (or should be) into the likelihood ratio P(D|H) / P(D). This is information the patient should share with the doctor.

Of course, if they don't, then the Bayes estimate is the best guess given all the information the doctor has.
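
For anyone who hasn't seen the textbook version worked through, here is a small sketch. The 1% prevalence, 90% sensitivity and 9% false-positive rate are hypothetical numbers picked for illustration, and the 20% figure for a patient who reports a risky exposure is equally made up.

```python
def p_disease_given_positive(prevalence: float,
                             sensitivity: float,
                             false_positive_rate: float) -> float:
    """P(disease | positive test) via Bayes' theorem.

    prevalence          = P(disease), the prior
    sensitivity         = P(positive | disease)
    false_positive_rate = P(positive | no disease)
    """
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # General population: the low prior dominates, posterior is about 0.09.
    print(p_disease_given_positive(0.01, 0.90, 0.09))

    # Patient who reports a risky exposure: assumed prior of 20%,
    # and the same test result now gives a posterior of about 0.71.
    print(p_disease_given_positive(0.20, 0.90, 0.09))
```

The test is the same in both calls. Only the prior changes, which is why the extra information from the patient matters so much.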

Edit: Your criticism about how we choose priors is fair. The better you are at this, the more accurate your answers become. I mention more about this in the "putting it to practice" section.

I agree with you, but my point is more broadly that in reality we often don't go through the steps "1. estimate probability" -> "2. make a decision based on the probability distribution", because step 1. is so error-prone and intractable, that we typically jump directly to step 2. and try to limit our downside.

Of course you could look back and say, given the fact that I took some decision, what would have been my prior if I had used Bayes theorem, but my point is that we don't actually use it for taking the decision.

That's not how priors work.
While I think it's fair to say that it's hard to come up with informative priors for many real problems, the Bayesian framework is pretty robust if you use weakly informative priors.
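
A quick way to see that robustness is a Beta-Binomial model, where a Beta(a, b) prior on a success rate, updated with s successes and f failures, gives a Beta(a + s, b + f) posterior. The sketch compares a flat Beta(1, 1), a weakly informative Beta(2, 2) and a deliberately skewed Beta(8, 2). The priors and the data counts are illustrative assumptions.

```python
def posterior_mean(a: float, b: float, successes: int, failures: int) -> float:
    """Mean of the Beta(a + successes, b + failures) posterior."""
    return (a + successes) / (a + b + successes + failures)


def spread(priors, successes: int, failures: int) -> float:
    """Gap between the largest and smallest posterior mean across the priors."""
    means = [posterior_mean(a, b, successes, failures) for a, b in priors]
    return max(means) - min(means)


if __name__ == "__main__":
    priors = [(1, 1), (2, 2), (8, 2)]  # flat, weakly informative, skewed

    # Same 30% observed rate, first with 10 observations, then with 1000.
    for s, f in [(3, 7), (300, 700)]:
        means = [round(posterior_mean(a, b, s, f), 3) for a, b in priors]
        print(f"n={s + f}: posterior means {means}, spread {spread(priors, s, f):.3f}")
```

With little data the choice of prior visibly moves the answer. With more data the posteriors converge, and the flat and weakly informative priors are hard to tell apart.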
Yup, no prior whatsoever makes much more sense /s https://xkcd.com/1132/
This comic was discussed by the author on Andrew Gelman's blog. Gelman hated it and posted his opinion. Hyde pretty much agreed, iirc.

Edit: https://statmodeling.stat.columbia.edu/2012/11/10/16808/

Randall Munroe, not Randall Hyde, is xkcd. Hyde is assembly language. Brain is mush, apparently...
There should be a named fallacy for that, "linking to a comic" -- although I guess it falls under fallacy fallacy.
Haha, this is amazing! I'd use it in the post, but I fear it's a strawman of the Frequentist side.
Reminds me of the time when the OPERA experiment claimed to have seen superluminal neutrinos: https://en.wikipedia.org/wiki/Faster-than-light_neutrino_ano...

Everyone not in the collaboration basically said "I bet you didn't measure superluminal neutrinos".