Hacker News
by wenc 2257 days ago
Data science is an overloaded term, but even so there are some salient differences between it and statistics.

Data science is more closely related to "statistical learning", and the knowledge it requires overlaps with, but looks quite different from, that of conventional statistics.

An easy way to get a sense of the difference is to compare the table of contents of a book like ISL (free PDF) [1] to the undergraduate curriculum of a statistics program. You'll find that the focus, and indeed the culture, of data science is really quite different from that of statistics.

Leo Breiman wrote about this in his paper "Statistical Modeling: The Two Cultures" [2]. Conventional statistics belongs to one culture, and statistical learning/data science sort of veers toward the other (though not completely).

Much has been made of the idea that "data science" is just statistics dressed up to look new, but I'm not convinced this is true. I'm also not convinced that pure statisticians have the right training to be data scientists -- additional training and mindset changes are needed. The reverse is also true: most data scientists lack the rigor and epistemological training to be statisticians.

[1] http://faculty.marshall.usc.edu/gareth-james/ISL/

[2] https://projecteuclid.org/euclid.ss/1009213726