by tansey 3800 days ago
It seems like most of these complaints center around the idea of the inherent subjectivity of the prior. In cases like astronomy and other hard sciences, the prior reflects actual scientific knowledge and is not really subjective at all. In cases where we don't have that kind of evidence, empirical Bayes methods work very well by just peeking at some subset of the data and finding a good point estimate for the prior.
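To make the empirical Bayes idea concrete, here is a rough sketch rather than anyone's canonical method. It assumes binomial data from several groups, fits a Beta prior to the observed rates by a crude method of moments (which ignores binomial sampling noise), and then shrinks each group's estimate toward the prior mean. The counts and the function names are made up for illustration.

```python
from statistics import mean, pvariance

# Hypothetical (successes, trials) counts for five groups; not from the thread.
groups = [(3, 10), (5, 10), (7, 10), (4, 10), (6, 10)]

def fit_beta_prior(counts):
    """Point-estimate Beta(a, b) from the observed rates by matching mean and variance."""
    rates = [s / n for s, n in counts]
    m = mean(rates)
    v = pvariance(rates)
    # A Beta distribution with mean m must have variance below m * (1 - m).
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("observed rates are not compatible with a Beta fit")
    total = m * (1.0 - m) / v - 1.0  # this is a + b
    return m * total, (1.0 - m) * total

def posterior_mean(successes, trials, a, b):
    """Posterior mean of the rate under a Beta(a, b) prior and binomial likelihood."""
    return (a + successes) / (a + b + trials)

a, b = fit_beta_prior(groups)
shrunk = [posterior_mean(s, n, a, b) for s, n in groups]
for (s, n), est in zip(groups, shrunk):
    print(f"raw={s / n:.3f}  shrunk={est:.3f}")
```

Each shrunk estimate lands between the group's raw rate and the pooled mean, which is the whole point: the data picks the prior, and the prior then regularizes the noisy per-group estimates.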

I'm also not sure why the OP thinks that calculating the normalizing constant is a huge issue. Most of the time you rarely need it since you're likely going to end up doing an MCMC or some other sampling method for the posterior, in which case you only need proportionality.
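A minimal sketch of why proportionality is enough: a random-walk Metropolis sampler only ever compares the target density at two points, so the normalizing constant cancels in the acceptance ratio. The coin-flip setup below (7 heads, 3 tails, Beta(2, 2) prior), the step size, and the function names are all assumptions chosen for illustration.

```python
import math
import random

def log_unnormalized_posterior(theta, heads=7, tails=3, a=2.0, b=2.0):
    """Log of prior * likelihood with every theta-independent constant dropped."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return (heads + a - 1) * math.log(theta) + (tails + b - 1) * math.log(1.0 - theta)

def metropolis(log_target, n_samples=20000, step=0.2, start=0.5, seed=0):
    """Random-walk Metropolis; needs the target only up to a constant factor."""
    rng = random.Random(seed)
    theta = start
    current = log_target(theta)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_target(proposal)
        # Only the difference of log densities appears here, so the
        # normalizing constant would cancel even if we knew it.
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples

samples = metropolis(log_unnormalized_posterior)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean: {posterior_mean:.3f}")
```

In this toy case the posterior happens to be Beta(9, 5), so the estimate can be checked against the exact mean 9/14. In realistic models there is no closed form, and that is exactly when not needing the constant matters.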

There are lots of problems with Bayesian methods in practice, but most of them revolve around the scalability of modern methods to massive data sets and very complicated models. Many Bayesians tend to think that it's absolutely crucial to quantify uncertainty and that the added computational cost and human effort are worthwhile. In practice, point estimate methods to find MAP or even just maximum likelihood values work really well for most problems. If you look at the trend in most machine learning, for instance, generally people find a cool way to solve some problem with good performance (e.g. SGD + Deep Nets), then some Bayesian lab spends a few years trying to interpret everything as a generative model and coming up with a clever way to sample everything (e.g. Lawrence Carin's lab at Duke has done a lot of this work in Deep Bayesian Nets). The end result is usually better, but by then most people have moved on to a newer problem and the appeal of getting a marginal boost in performance is harder for me to see. The Bayesian nonparametrics crowd has historically done a pretty good job of hitting a sweet spot of compromise on this by keeping a Bayesian view but still (usually) treating everything as an optimization problem first (e.g. variational inference methods).

3 comments

It's odd to complain specifically about subjectivity of the prior when the likelihood is often just as subjective. Gelman puts this well here:

http://andrewgelman.com/2015/01/27/perhaps-merely-accident-h...

It seems like your attitude towards statistical errors is largely going to depend on the risk of making bad predictions.

For example, the problem with the denominator containing "unknown unknowns" isn't much of an issue if you're searching photographs or optimizing ad revenue. It's much more important for something safety-related like building an airplane or a driverless car.

Finance is in the middle: not directly safety related, but modeling tail risk the wrong way could bankrupt the company.

> I'm also not sure why the OP thinks that calculating the normalizing constant is a huge issue. Most of the time you rarely need it since you're likely going to end up doing an MCMC or some other sampling method for the posterior, in which case you only need proportionality.

Right, and when doing Bayesian model selection the evidence drops out of the problem completely. All you need are the priors and the likelihoods.
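Spelled out, reading "the evidence" here as the normalizing constant P(D) and taking two competing hypotheses H_1 and H_2 (the symbols are mine, not from the article), the posterior odds are:

```latex
\frac{P(H_1 \mid D)}{P(H_2 \mid D)}
  = \frac{P(D \mid H_1)\, P(H_1) / P(D)}{P(D \mid H_2)\, P(H_2) / P(D)}
  = \frac{P(D \mid H_1)}{P(D \mid H_2)} \cdot \frac{P(H_1)}{P(H_2)}
```

P(D) appears in both numerator and denominator and cancels, leaving only the likelihood ratio times the prior odds.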

The claim that ~H is "not itself a valid hypothesis" is dubious. If H is the hypothesis that a certain continuous parameter has a value in the range [a, b], then it's perfectly obvious what ~H means.
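As a sketch of that case, assuming a prior density p(θ) over the parameter (notation introduced here for illustration), both the prior probability of ~H and the likelihood under ~H are well defined:

```latex
P(\lnot H) = 1 - \int_a^b p(\theta)\, d\theta,
\qquad
P(D \mid \lnot H) = \frac{1}{P(\lnot H)} \int_{\theta \notin [a, b]} P(D \mid \theta)\, p(\theta)\, d\theta
```

So ~H is just the prior restricted to the complement of [a, b] and renormalized, provided that complement has nonzero prior mass.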

Of course it's possible to choose a vague, overly broad hypothesis, but a frequentist analysis of such a hypothesis is going to be just as bad as a Bayesian analysis. "Garbage in, garbage out" is true no matter what tool you use.