by vegabook
4223 days ago
Yes, I have moved (back) to Python, mainly because R is too slow once we get beyond a certain data size, and the language is not powerful enough when data has to be moved around at scale. I get a 5-10x speed improvement in native Python, and another 30x on top of that if I can vectorize things in NumPy. A huge caveat, however, is that R is much more succinct for exploratory analysis during what I call the "data rotation" phase, because its vectorized nature is so much more efficient at selecting, reducing, cleaning and rotating data than even pandas can manage. It's irritating to constantly write list comprehensions for what would often have been a ridiculously direct and efficient vectorized command in R. Moreover, R's graphics leave matplotlib in the dust, though this advantage is eroding as the JS libraries take over. The other area where Python crushes R is live-streaming data. Here you inevitably need a full-fledged programming language that is not batch-oriented, with proper asynchronous I/O and multithreading / multiprocessing capabilities.
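To make the list-comprehension vs. vectorized contrast concrete, here is a rough sketch of the same outlier filter written both ways. The function names, the sample data and the z-score threshold are all made up for illustration; this is not a benchmark and makes no claim about the actual speedup you would see.

```python
import numpy as np


def zscore_filter_python(values, threshold):
    """Keep values whose |z-score| is below threshold, in plain Python."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divide by n), matching np.std's default.
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v for v in values if abs(v - mean) / std < threshold]


def zscore_filter_numpy(values, threshold):
    """Same filter, expressed as whole-array operations and a boolean mask."""
    arr = np.asarray(values, dtype=float)
    z = np.abs(arr - arr.mean()) / arr.std()
    return arr[z < threshold]


# Hypothetical sample data: four ordinary readings and one obvious outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

print(zscore_filter_python(data, 1.5))
print(zscore_filter_numpy(data, 1.5))
```

The NumPy version is closer to what you would type in R: compute on the whole array, then index with a boolean mask, with no explicit loop.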