by ibab 3606 days ago
Why should not having a BS in statistics prevent anyone from properly learning and applying statistics in a rigorous scientific fashion?

There are a lot of people with a rigorous mathematical background (mathematicians, physicists, biologists, computer scientists, ...) who are perfectly capable of understanding and applying stats concepts at a high level. In addition, these people have a lot of experience with doing scientific research, so shouldn't they be even more qualified to call themselves "data scientists"?

Can you give an example of something that clearly distinguishes a "data scientist" from, say, a physicist who learned regression from a stats textbook?


There are a lot of gotchas that don't look like errors but completely invalidate an analysis when a technique is applied without a thorough understanding of it.

For example, you can learn regression from a stats textbook, but unless you've gone through a thorough (and painful) graduate-level stats course, you probably haven't seen the edge cases that violate its assumptions and call for a more complex regression. Your regression may suggest there is no effect, but when you look at the residuals you may find systematic bias that you can model with a subject-specific random effect, or with some transformation as a generalized linear model.
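To make that concrete, here is a rough sketch of the residual gotcha on made-up data. Everything in it is an assumption for illustration: the five subjects, their baselines, the true within-subject slope of 2, and the helper names `ols`, `simulate`, and `centre_by_subject`. It also swaps in plain within-subject centering as a simple stand-in for a proper random-effects model, which you would normally fit with a mixed-model library.

```python
import random
from statistics import mean


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate(seed=0):
    """Hypothetical data: 5 subjects, each measured at 5 nearby x values.

    Within every subject y rises by 2 per unit of x, but each subject has
    its own baseline and its own x range, which hides the effect when all
    rows are pooled.
    """
    rng = random.Random(seed)
    baselines = [5.0, 3.0, 4.0, 3.0, 5.0]   # assumed subject-level offsets
    offsets = [-1.0, -0.5, 0.0, 0.5, 1.0]   # x positions around each centre
    rows = []
    for subject, base in enumerate(baselines):
        centre = 10.0 * subject
        for d in offsets:
            y = base + 2.0 * d + rng.gauss(0.0, 0.1)
            rows.append((subject, centre + d, y))
    return rows


rows = simulate()
subjects = sorted({s for s, _, _ in rows})
x = [r[1] for r in rows]
y = [r[2] for r in rows]

# Naive pooled regression: ignores which subject each row came from.
intercept, pooled_slope = ols(x, y)

# Residual check: average residual per subject. Values far from zero mean
# the errors cluster by subject instead of being independent.
residuals = [yi - (intercept + pooled_slope * xi) for xi, yi in zip(x, y)]
mean_resid = {
    s: mean(r for (si, _, _), r in zip(rows, residuals) if si == s)
    for s in subjects
}


def centre_by_subject(values):
    """Subtract each subject's own mean from its values."""
    means = {
        s: mean(v for (si, _, _), v in zip(rows, values) if si == s)
        for s in subjects
    }
    return [v - means[s] for (s, _, _), v in zip(rows, values)]


# Within-subject slope: removes subject-level baselines before fitting.
_, within_slope = ols(centre_by_subject(x), centre_by_subject(y))

print("pooled slope:", round(pooled_slope, 3))
print("mean residual per subject:", {s: round(m, 2) for s, m in mean_resid.items()})
print("within-subject slope:", round(within_slope, 3))
```

On this toy data the pooled fit reports a slope close to zero, the per-subject residual means are clearly not centered on zero, and the within-subject fit recovers a slope near the simulated value of 2. Real data will rarely be this clean, so treat it as an illustration of the failure mode and not as a recipe.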

That isn't to say you need a graduate-level stats degree, but applying statistics without understanding the pitfalls can lead to seriously wrong conclusions.

That's a good point, but I suspect that a lot of the serious gotchas a data scientist might encounter in the wild are not taught as part of a graduate-level statistics course. Being able to think critically and adapt quickly to the problem at hand might end up being more important than previous experience in stats (which is still very valuable, of course).