by msellout 3828 days ago
I take the narrow view on "statistician". I agree that many if not most scientists are poorly trained in statistics even though all major journals try to throw a veneer of mathematics on their publications.

As for the comparison with ML, I think a large chunk of the ML community aims for (with good reason) evidence of predictive capacity rather than theoretical soundness. Not everyone. I'll grant that a good portion care deeply about theory. Look at the arguments between SVM folks and "Neural" Nets folks.

It comes down to a difference in focus. Statistics cares about causal inference. Machine Learning cares about prediction. Nothing wrong with either, but their techniques are sometimes ill-suited for the other purpose.

1 comments

I agree with your distinction between groups who care about "causal inference" (as in the debates between Judea Pearl and Andrew Gelman on the role of toy problems in statistics) and groups who care more about "prediction engineering". We should be careful, though, to admit that people in the ML prediction engineering camp care very, very much about the theoretical properties of their methods, especially in avoiding overfitting, because engineering prediction in a climate of overfitting is useless.

I would just add a big third category that probably encompasses the vast majority of people who "work in statistics": people who are interested in neither causal inference nor predictive efficacy, but in a much less rigorous idea of "explanatory modeling". This group is generally very poor at statistical hygiene.