by stdbrouw
3988 days ago
Definitely agree that those long lists that tell you to first become awesome at combinatorics and linear algebra, then learn all about statistical inference (that is, not actual statistical procedures but the mathematical underpinnings of statistics that would enable you to construct and evaluate methods you invent yourself), then move on to stochastic optimization... those are really more about machismo than about actually helping people learn data science. Sure, linear algebra is helpful, but whether it's fundamental really depends on the kind of data science you're keen to do. I also generally dislike /r/machinelearning and /r/statistics because they seem to have been taken over by people who will tell you to either get a PhD or get out.
But, for me, just learning whatever I thought I needed to solve the problem at hand got me stuck really fast. There's so much in statistics that you really just have to learn first before you can start to see when and why you'd want to use it. It never occurred to me to use hierarchical modeling and partial pooling for a certain set of problems until after I'd read Gelman & Hill. I never thought that inference on a changing process might require different techniques from those for stationary processes until I had to study hidden Markov models for an exam. Heck, when I got started with data analysis I didn't even realize that the accuracy of most statistics improves only in proportion to sqrt(n), so quadrupling your data merely halves the error, and the next logical step in my mind was always "get more data!" instead of "learn more about statistics!" (If you look at the industry's obsession with unsampled data, data warehouses that store absolutely everything ever, and map/reduce, my hunch is I'm not the only one who lacks or at some point lacked elementary statistical knowledge because it just never came up on their self-motivated, self-directed learning path.)
So I think the ideal learning path incorporates a bit of both: learn more about what excites you and about what's immediately useful right now, but also set aside some time to fill in gaps in your knowledge, even things that don't immediately look useful, and make some time for fundamental/theoretical study. (x-posted from DataTau)
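For anyone who wants to see the sqrt(n) point for themselves, here's a rough simulation sketch, not anything rigorous. It assumes standard normal data and uses the sample mean as the statistic; the function name `standard_error_of_mean` and the trial count are just my own illustrative choices. The idea is that each 4x increase in sample size should roughly halve the spread of the estimate.

```python
import random
import statistics


def standard_error_of_mean(n, trials=1000, seed=0):
    """Empirical spread of the sample mean across many samples of size n.

    Assumption for illustration: data are i.i.d. standard normal draws.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    # The standard deviation of the sample means estimates the standard error.
    return statistics.stdev(means)


if __name__ == "__main__":
    # Each step quadruples n; the printed spread should shrink by about half.
    for n in (25, 100, 400):
        print(n, round(standard_error_of_mean(n), 4))
```

So going from 100 to 400 observations buys you about a 2x tighter estimate, and the next 2x costs 1,600. That's the diminishing return that makes "just get more data" a weaker plan than it first looks.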
http://www.datatau.com/
Check this out too: http://www.pyquantnews.com/