by mariuskempe 4762 days ago
I should like to give a contrarian comment about this, because it is on top of the front page and seems to be received positively. This book is probably not a good way to learn about statistical inference. It has quite confused explanations of both Bayesian and frequentist approaches. The preface seems to imply that programmers, by virtue of being able to use computers, don't need to take a rigorous mathematical course in Bayesian methods. However, the text actually uses mathematical notation throughout, and as far as I could tell it is often not explained. I noticed at least one case where a probability distribution (gamma) was described only through plots, i.e. without specifying its pdf or how you could derive the pdf. I think the kind of discourse that this book exemplifies is halfway to cargo cult 'statistics'.
4 comments

I've got exactly the same feeling. Could you suggest a good introductory textbook on MCMC? They left it as a mysterious black box and I'm not very comfortable with using mathematical black boxes I don't understand.
David MacKay covers MCMC in his exhaustive book titled "Information Theory, Inference, and Learning Algorithms", available for free here: http://www.inference.phy.cam.ac.uk/itprnn/book.html

MCMC methods are covered near the end of the book. It should be enough to familiarize yourself with and understand the basic concepts of MCMC. Anything more in-depth will require a strong mathematical background.

BTW : There are probably a ton of books that cover MCMC out there - that's just one I liked and which is freely downloadable.

You can also get a PDF of Barber's BRML or look in Murphy's ML text, which isn't freely available as a PDF.

Check the inside title page to make sure you get the 3rd printing of Murphy's: http://www.cs.ubc.ca/~murphyk/MLbook/errata.html

http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=...

I have some background (grad student in cfd, thinking about switching to some sort of data analysis later on) but my measure theory and probability skills are rusty (on the other hand numerical linear algebra, functional calculus and complex analysis are superb). What would be a good book for my level?
You'll have no trouble.
I'm sorry! I misread your question as 'Would this be a good book for my level?'.
I can honestly say this book changed the way I think about everything. I can't recommend it highly enough.
http://www.amazon.com/Data-Analysis-Bayesian-Tutorial-Public... is short and reputable on Bayesian statistics. On MCMC specifically, I don't know, but MCMC is really a kind of algorithm that lets you find the answer to a mathematical question (so I think understanding the math is the right thing to start with).

PS. There is a second edition of that book, but I've heard that the first edition is better, because the second edition added a different author and expanded the book.
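To make the "MCMC is an algorithm that answers a mathematical question" point concrete, here is a minimal sketch of random-walk Metropolis, one of the simplest MCMC algorithms. The question it answers is: what is the posterior mean of a coin's bias after seeing 7 heads and 3 tails, starting from a uniform prior? The data, function names, step size and burn-in length are all assumptions made up for illustration; none of them come from either book.

```python
import math
import random


def log_posterior(p, heads=7, tails=3):
    """Unnormalized log posterior for a coin's bias p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return heads * math.log(p) + tails * math.log(1.0 - p)


def metropolis(log_density, start, steps, step_size, rng):
    """Random-walk Metropolis sampler over a single real parameter."""
    x, log_p = start, log_density(start)
    samples = []
    for _ in range(steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old); otherwise stay put.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


rng = random.Random(0)
samples = metropolis(log_posterior, start=0.5, steps=50_000, step_size=0.2, rng=rng)
kept = samples[1_000:]  # discard an (arbitrary) burn-in period
posterior_mean = sum(kept) / len(kept)
print(posterior_mean)  # the exact answer is 8/12, the mean of Beta(8, 4)
```

In this toy case the integral can be done by hand, which is what makes it a useful check. The sampler only ever needs the unnormalized density, and that is why the same loop still works when the math has no closed form.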

Would you perhaps be willing to contribute to the project in order to improve its explanations?
Well, no, because I don't know any good reasons for using Bayesian methods (except when prior probability distributions can be found objectively through some previous experiment etc).
how do you reconcile "I don't know any good reasons for using Bayesian methods" with the fact that Bayesian methods revolutionized spam-filtering? (or maybe you disagree they did?)
Naive Bayes revolutionized spam filtering because it is incredibly easy to implement and understand, and was reasonably effective for early spam, not because it was the best model for detecting spam. There's a reason we started seeing ads for "v1agra" and snippets of prose -- it's pretty easy to game Naive Bayes.

On the other hand, the GP's assertion that there is seldom a need for using Bayesian methods is also unwarranted; they are the basis for so many machine learning algorithms in common use -- particle filters, for example.

That's a good question and I was asking myself that after I wrote that comment. I think my objection is more to the 'Bayesian' and less to the 'methods', if that makes sense. That is, I think constructing and updating models using Bayes' theorem can be (as people doing spam-filtering have shown) a good way of making predictions, but that it is the frequentist properties of the models that actually matter (cf. the ubiquity of cross-validation: 'the proof of the pudding is in the eating'), not the fact that they let you maintain a probability distribution over parameters.
To add to a comment below -- naive Bayes is a simple classifier which doesn't really have much in common with full-on Bayesian methods.
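For anyone who hasn't seen one, here is a rough sketch of a naive Bayes spam scorer. The training messages are invented toy data and the function names are hypothetical. Note that it only counts words and plugs smoothed frequencies into Bayes' theorem; it never maintains a probability distribution over parameters. Tokens it has never seen are ignored, which is one simple convention among several.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (label, text). Returns per-label word counts and message counts."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for label, text in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def spam_log_odds(text, word_counts, label_counts):
    """log P(spam | text) - log P(ham | text), treating words as independent given the label."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    score = math.log(label_counts["spam"] / label_counts["ham"])
    for word in text.lower().split():
        if word not in vocab:
            continue  # unseen tokens such as "v1agra" contribute no evidence
        for label, sign in (("spam", 1), ("ham", -1)):
            total = sum(word_counts[label].values())
            # Add-one (Laplace) smoothing so a zero count never gives log(0).
            p = (word_counts[label][word] + 1) / (total + len(vocab))
            score += sign * math.log(p)
    return score


# Invented toy corpus, for illustration only.
toy = [
    ("spam", "buy cheap viagra now"),
    ("spam", "cheap viagra offer"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch at noon"),
]
word_counts, label_counts = train(toy)

plain = spam_log_odds("cheap viagra", word_counts, label_counts)
gamed = spam_log_odds("v1agra meeting notes", word_counts, label_counts)
print(plain > 0, gamed < 0)
```

A positive score means "more likely spam". The second message shows the gaming trick mentioned elsewhere in the thread: misspell the incriminating word so it is unseen, then pad with innocuous prose that pulls the score toward ham.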
Is this perhaps a suggestion that something you don't want to personally contribute to shouldn't be criticised?
>without specifying its pdf or how you could derive the pdf

Would you or anyone happen to know of a good book that discusses the derivation of various advanced probability distributions? It is quite frustrating that every ML or stats book I come across runs through various distributions without giving the reader any sort of motivation or intuition behind them. Without that intuition how am I supposed to have any idea when to apply one vs another?

I honestly can't recommend a book for this. The best resource I've found is MathWorld. I've picked up a bunch of very helpful intuitions from it, including:

- Cauchy: the horizontal distance from the origin at which an arrow shot at a random angle from a point below the origin hits the x-axis

- Gamma: how long you have to wait for the nth event in a Poisson process

I'm sure there must be books that I haven't read.
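Both intuitions above can be checked by simulation. The sketch below assumes the arrow starts at (0, -1) and that the Poisson process has rate 2 with n = 3; those numbers are arbitrary choices for illustration. It then compares the simulated values with two known facts: a Gamma distribution with shape n and rate λ has mean n/λ, and a standard Cauchy variable lands in (-1, 1) with probability 1/2.

```python
import math
import random

rng = random.Random(0)


def wait_for_nth_event(n, rate):
    """Time until the nth event of a Poisson process: a sum of n exponential gaps."""
    return sum(rng.expovariate(rate) for _ in range(n))


def arrow_hit():
    """x-coordinate where an arrow shot from (0, -1) at a uniform random angle crosses the x-axis."""
    angle = rng.uniform(-math.pi / 2, math.pi / 2)  # measured from the vertical
    return math.tan(angle)


trials = 100_000

waits = [wait_for_nth_event(3, rate=2.0) for _ in range(trials)]
mean_wait = sum(waits) / trials  # Gamma(shape=3, rate=2) has mean 3 / 2

hits = [arrow_hit() for _ in range(trials)]
inside_unit = sum(abs(x) < 1 for x in hits) / trials  # standard Cauchy: P(|X| < 1) = 1/2

print(mean_wait, inside_unit)
```

The Cauchy check uses a probability and not a sample mean on purpose: the Cauchy distribution has no mean, so averaging the arrow hits never settles down no matter how many trials you run.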

It's amazing how popular the term "Bayesian" is amongst people who don't really know what it means or quite where it fits in the context of other statistical paradigms.