by NumberCruncher
3508 days ago
As a statistician I spend about 80% of my time collecting and transforming data, maybe because I never had the luxury of having one or two data engineers of my own doing that for me. Having worked with SAS, SPSS, Matlab and Python, and tried some other tools, I would say that the choice of statistical programming language does not make a big difference. Once you understand a modelling process, you can reproduce or use it in any language as long as there is a documented package for it. On the other hand, knowing how to work with data is IMHO more important. Being an SQL pro, knowing how to think in data sets instead of data records, when to use flat tables, and how to use vectorization and matrix manipulation even for everyday tasks, especially in "in memory" systems, is essential. I would say SQL + R/Python makes a good combination. With that you can solve a lot of problems in at least two different ways. R is being integrated step by step into DWHs, which makes things a lot easier. I hope SAS dies a short and painful death, but it could also be a valid choice.
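To illustrate the point about thinking in data sets instead of data records, and about solving the same problem in at least two ways, here is a minimal sketch. The `orders` data and both function names are made up for the example. It uses only Python's standard library `sqlite3` module as a stand-in for a real database.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, amount) order records.
orders = [
    ("alice", 120.0),
    ("bob", 35.5),
    ("alice", 80.0),
    ("carol", 50.0),
    ("bob", 14.5),
]


def totals_record_wise(rows):
    """Record-at-a-time: loop over the rows and update a running total."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


def totals_set_based(rows):
    """Set-based: load the rows into an in-memory table and let SQL aggregate."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        return dict(conn.execute(query).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_record_wise(orders))
    print(totals_set_based(orders))
```

Both functions return the same totals per customer. The SQL version states what result is wanted and leaves the iteration to the engine, which is roughly what I mean by thinking in data sets.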