|
I would suggest the following topics (forgive me if this is a bit disorganised):
Languages: R, Python (pandas, NumPy), C++ or Java, MATLAB/Octave
Stats & Machine Learning Topics:
Neural Nets
Decision Trees
SVM
Regression (Linear, Nonlinear, Logistic)
Clustering (K-means, Fuzzy C-means, Mixture Modelling, etc.)
Time-series Modelling/Prediction (AR, ARMA, ARIMA, Exponential Smoothing)
Bayesian Techniques
Statistical Model Development
System experience: Linux administration
Big Data Libraries (Hadoop at a minimum; Mahout, Hive, Storm, and YARN as well)
Data cleaning and large-scale data management
|