by StClaire 3406 days ago
They start to diverge on the issue of what a probability actually is: frequentists see probabilities as long-run averages, and Bayesians see them as degrees of belief.

If you have a coin that comes up heads 60% of the time, a frequentist looks at that as, "as the number of times I flip my coin goes towards infinity, the proportion of heads I get goes to 60%." A Bayesian thinks, "absent other evidence on how the coin gets flipped, I'm about 60% sure the outcome of the flip will be heads."
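The frequentist reading can be sketched with a quick simulation. This is only an illustration: the function name, the seed, and the flip counts are my own choices, not anything from a real dataset.

```python
import random

def heads_proportion(n_flips, p_heads=0.6, seed=0):
    """Simulate n_flips of a biased coin and return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips

# The proportion wanders for small n and settles near 0.6 as n grows.
for n in (10, 1_000, 100_000):
    print(n, heads_proportion(n))
```

The Bayesian "60% sure" statement is about the next single flip, so it needs no simulation at all.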

This lets Bayesians talk about the probability of single events, like basketball games, where the frequentist long-run definition doesn't directly apply because the game only gets played once.

Bayesians also see conditional probabilities everywhere. A conditional probability says, "well, if I know something about the situation, I should include it in my beliefs." Circumstances matter. It doesn't make a ton of sense to talk about the chances I get hit by a car. They vary wildly depending on whether I'm standing on the highway or eating in my kitchen.

Another basketball example. The chances that the Spurs win change dramatically depending on whether they play the Warriors or the Kings. I might say "they have a 90% chance of winning given they play the Kings, but a 40% chance of winning given they play the Warriors."

I also need a likelihood function: what are the odds I'd see my data given my hypothesis is true? If I got hit by a car, what are the chances I was standing in the street? Given the Spurs won, what are the chances they played the Warriors?
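To put numbers on that last question, here is a rough sketch using the 90% and 40% figures from above. It needs one more number that I haven't given: how often the Spurs face each team. The 50/50 split below is purely an assumption for illustration, as is the idea that these are the only two possible opponents.

```python
p_win_given_warriors = 0.40  # from the example above
p_win_given_kings = 0.90     # from the example above
p_warriors = 0.50            # ASSUMPTION: half of the games are against the Warriors
p_kings = 1 - p_warriors     # ASSUMPTION: the Kings are the only other opponent

# Overall chance of a win, averaging over who the opponent is.
p_win = p_win_given_warriors * p_warriors + p_win_given_kings * p_kings

# Flip the conditional around: given a win, how likely was the opponent the Warriors?
p_warriors_given_win = p_win_given_warriors * p_warriors / p_win

print(round(p_win, 3), round(p_warriors_given_win, 3))
```

Under those made-up opponent odds, a win makes it less likely the game was against the Warriors than the 50% I started with, which is what you'd expect.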

We use something called Bayes' rule, which allows us to pile more and more information onto something we call a "prior belief": what we thought about our hypothesis before we saw our data. As we pile on data, we expect our beliefs to change. Maybe we become more sure of what we thought, maybe we become less sure, maybe we change our minds entirely.

I want to use Bayes' example since I think it's so good. Imagine you came out of Plato's cave and saw the sun rise. You'd think, that's weird, I bet that doesn't happen again. The sun goes down, and you spend some time in the dark. The next morning the sun comes up again. Now you're less sure that sunrises are fluke events. Plus you found some people who aren't freaking out about the whole "big ball of fire in the sky" thing. Maybe now you don't expect the sun not to rise tomorrow. Maybe it will, maybe it won't. As you see more and more sunrises, eventually you get to the point where you are extremely confident that the sun rises every morning. You saw more sunrises and updated your beliefs.
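One way to sketch that updating in code is a Beta prior over "the sun rises on a given day," updated once per sunrise. The model and the starting weights (1 for "it rises" versus 9 for "it doesn't," i.e. about 10% to begin with) are my assumptions to mimic the skeptical cave-dweller, not part of the original example.

```python
def p_sunrise_tomorrow(sunrises_seen, prior_yes=1.0, prior_no=9.0):
    """Posterior mean of a Beta(prior_yes, prior_no) belief after seeing
    sunrises_seen sunrises and zero mornings without one."""
    return (prior_yes + sunrises_seen) / (prior_yes + prior_no + sunrises_seen)

# Belief climbs from skeptical toward near-certainty as sunrises pile up.
for days in (0, 1, 2, 10, 100, 10_000):
    print(days, round(p_sunrise_tomorrow(days), 4))
```

The belief never quite reaches 100%, but it gets as close as you like with enough mornings.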

We need one last piece of information: a prior. That's the initial belief your cave-escaping self had that sunrises are weird and you probably won't see another one. We can estimate priors from population data--the percentage of games the Spurs won against the Warriors--or we could just make them up. A prior is just our belief about the truth of our hypothesis before we see the data; the chance I get hit by a car regardless of where I am.

We take all this and put it into Bayes' rule, a blender that gives us the probability that our hypothesis is true given we saw our data. We can use this as a new prior too.

One last example. I'm 70% sure the earth is round. I see a picture of the horizon taken from a hot air balloon and I think there's a 90% chance that I would see that if the earth were truly round. Without going into the calculation, which also needs the chance I'd see that picture if the earth weren't round, I'm now give or take 80% sure the earth is round. I saw data to support my hypothesis and my belief got stronger.
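For anyone who does want the calculation, here is a minimal sketch. The 70% and 90% come from the example; the 52.5% chance of seeing the same picture on a not-round earth is a number I picked so the answer lands on roughly 80%. The function name is just for illustration.

```python
def bayes_update(prior, p_data_if_true, p_data_if_false):
    """Return P(hypothesis | data) via Bayes' rule for a yes/no hypothesis."""
    evidence = p_data_if_true * prior + p_data_if_false * (1 - prior)
    return p_data_if_true * prior / evidence

prior_round = 0.70            # from the example
p_photo_if_round = 0.90       # from the example
p_photo_if_not_round = 0.525  # ASSUMPTION: chosen so the posterior comes out near 80%

posterior_round = bayes_update(prior_round, p_photo_if_round, p_photo_if_not_round)
print(round(posterior_round, 3))

# The posterior becomes the new prior for a second photo
# (ASSUMPTION: the second photo is independent evidence with the same likelihoods).
after_second_photo = bayes_update(posterior_round, p_photo_if_round, p_photo_if_not_round)
print(round(after_second_photo, 3))
```

If the picture were just as likely on a flat earth (90% either way), the update would leave me at 70%: data only moves a belief when it is more likely under one hypothesis than the other.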

Why do Bayesian analysis? Because someone once published a study that says that frogs can sense earthquakes some time before they occur. That may be true, but I'm skeptical. One study would only move my skeptical prior slightly; I would still need more information, like a replication of the study by other people, to actually be convinced.

1 comment

This guy/girl nailed it. It took me a while to realize the distinction really exists in the methodological constraints each of these paradigms imposes on the process of reasoning. I found it helpful to think of them as two different computational models: although they often agree on the outcome, the process of arriving at an answer differs quite significantly.