| HN Mirror


by p4wnc6 3830 days ago

I have had the exact opposite experience with machine learning and statistics. In my experience, those who come from the 'statistics' side tend to use constructs, like null hypothesis significance testing (NHST), that are not consistent even from a theoretical point of view. Worse, when they use them, they do awful things like p-hacking, or directly comparing t-stats as a model selection criterion, practices that are rife with theoretical problems of their own, not to mention lots of statistical biases.
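To make the p-hacking complaint concrete, here is a minimal simulation sketch. It is not from any of the papers below. It assumes a simple setup of my own choosing: every outcome is pure noise (so the null is true), each is tested with a two-sided z-test with known unit variance, and the "analyst" reports only the smallest p-value. The function names and parameters are hypothetical.

```python
import math
import random

def two_sided_p(sample):
    """Two-sided z-test p-value for H0: mean = 0, assuming known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(n_outcomes, n_trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of null 'studies' that find at least one p < alpha
    when n_outcomes independent noise outcomes are each tested."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        p_values = [
            two_sided_p([rng.gauss(0, 1) for _ in range(n)])
            for _ in range(n_outcomes)
        ]
        # The "hack": keep whichever outcome happens to look best.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_trials

honest = false_positive_rate(1)   # one pre-registered outcome
hacked = false_positive_rate(20)  # best of 20 outcomes
print(f"one test: {honest:.3f}, best of 20: {hacked:.3f}")
```

With independent tests the theoretical rate for the best of 20 is 1 - 0.95**20, roughly 0.64, against a nominal 0.05 for a single test, so the simulated numbers should land near those values.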

I find the machine learning approach is far more humble. It starts out by saying that I, as a domain expert or a statistician, probably don't know any better than a lay person what is going to work for prediction or how to best attribute efficacy for explanation. Instead of coming at the problem from a position of hubris, assuming that I and my stats background know what to do, I will instead try to arrive at an algorithmic solution that has provable inference properties, and then allow it to work and commit to it.

Either side can lead to failings if you just try to throw an off-the-shelf method at a problem without thinking, but there's a difference between criticizing the naivety with which a given practitioner uses the method versus criticizing the method itself.

When we look at the methods themselves, I see much more care and humility, and more effort to avoid statistical fallacies, in the machine learning world. I see a lot of sloppy hacks and approaches that are invalid from first principles (like NHST) on the 'statistics' side. And even when we consider how practitioners use them, both sides are pretty much equally guilty of trying to just throw methods at a problem like a black box. Machine learning is no more of a black box than a garbage-can regression from which t-stats will be used for model selection. However, all of the notorious misuses of p-values, and the conflation over policy questions (questions for which a conditional posterior is necessarily required, but for which likelihood functions are substituted as a proxy for the posterior), seem to be a problem unique to the 'statistics' side.

Three papers that I recommend for this sort of discussion are:

[1] "Bayesian estimation supersedes the t-test" by Kruschke, http://www.indiana.edu/~kruschke/BEST/BEST.pdf

[2] "Statistical Modeling: The Two Cultures" by Breiman, https://projecteuclid.org/euclid.ss/1009213726

[3] "Let's put the garbage-can regressions and garbage-can probits where they belong" by Achen, http://www.columbia.edu/~gjw10/achen04.pdf

2 comments

51109 3829 days ago

Thanks for the links to interesting papers. I really liked the Breiman paper. I was not trying to characterize either machine learners or statisticians as bad or good, just pointing out a difference in their approaches to problems.

I do not know enough about statistics to make a (negative) quality statement about it. I know a bit more about machine learning though, and there I also see things like: Picking the most favorable cross-validation evaluation metric, comparing to "state-of-the-art" while ignoring the real SotA, generating your own data sets instead of using real-life data, improving performance by "reverse engineering" the data sets, reporting only on problems where your algo works, and other such tricks. I believe you when you say much the same is happening for statisticians.

Maybe it was my choice of words (careful, sober). I think it's fair to say that (especially applied) machine learners care more about the result, and less about how they got to that result. Cowboys, in the most positive sense of the word. I retraced where I got the cliff analogy. It's from Caruana in his video "Intelligible Machine Learning Models for Health Care" https://vimeo.com/125940125 @37:30.

"We are going too far. I think that our models are a little more complicated and higher variance than they should be. And what we really want to do is to be somewhere in the middle. We want this guy to stop and we want that statistician to get there, together we will find an optimal point, but we are not there yet."


ivansavz 3828 days ago

Thx for linking to the Caruana video, very interesting.


msellout 3829 days ago

You may have formed your generalization about statisticians from a biased sample. Or perhaps you're conflating statistics (ab)users with statisticians. There are far more people who have heard of a t-stat and r-squared than people I would call statisticians.


p4wnc6 3829 days ago

I disagree. Most of the egregious stuff is in published statistics literature, particularly in econometrics, psychology, medicine, and biology, from researchers whose full-time job is to use statistics to solve applied problems ("domain statisticians" if you will).

Even if your definition of "statistician" only applied to Wasserman or Gelman types, I'd still say that the machine learning folks of the same level exhibit far more caution about the theoretical properties of their models (not a knock against Wasserman or Gelman, just a property of the rigor of e.g. PAC learning versus some ad hoc hierarchical model).


msellout 3828 days ago

I take the narrow view on "statistician". I agree that many if not most scientists are poorly trained in statistics even though all major journals try to throw a veneer of mathematics on their publications.

As for the comparison with ML, I think a large chunk of the ML community aims for (with good reason) evidence of predictive capacity rather than theoretical soundness. Not everyone. I'll grant that a good portion care deeply about theory. Look at the arguments between SVM folks and "Neural" Nets folks.

It comes down to a difference in focus. Statistics cares about causal inference. Machine Learning cares about prediction. Nothing wrong with either, but each side's techniques are sometimes ill-suited to the other's purpose.


p4wnc6 3827 days ago

I agree with your distinction between groups who care about "causal inference" (see, for example, the debates between Judea Pearl and Andrew Gelman on the role of toy problems in statistics) and groups who care more about "prediction engineering". We just need to be careful to also admit that people in the ML prediction engineering camp care very, very much about the theoretical properties of their methods, especially in avoiding overfitting, because engineering prediction in a climate of overfitting is useless.

I would just add a big third category that probably encompasses the vast majority of people who "work in statistics": people who are interested in neither causal inference nor predictive efficacy, but in a much less rigorous idea of "explanatory modeling" -- and this group is generally very poor at statistical hygiene.
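As a rough illustration of why overfitting makes prediction engineering useless, here is a toy sketch under assumptions of my own: the labels are fair coin flips with no relation to the feature, and the model is a 1-nearest-neighbour rule that simply memorizes the training set. The helper names are hypothetical.

```python
import random

def one_nn_predict(train, x):
    """Return the label of the training point whose feature is closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(train, data):
    """Fraction of points in data that the 1-NN rule built on train labels correctly."""
    correct = sum(one_nn_predict(train, x) == y for x, y in data)
    return correct / len(data)

def make_noise(rng, n):
    """n points with a random feature and a coin-flip label (no real signal)."""
    return [(rng.random(), rng.randint(0, 1)) for _ in range(n)]

rng = random.Random(0)
train = make_noise(rng, 200)
test = make_noise(rng, 2000)

train_acc = accuracy(train, train)  # each point is its own nearest neighbour
test_acc = accuracy(train, test)    # held-out data exposes the lack of signal
print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

The training accuracy looks perfect because the rule memorizes noise, while the held-out accuracy should sit near chance. Only the held-out number says anything about predictive efficacy.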
