by luhego 2360 days ago
I used R when I took an online course on Data Analysis. I didn't like it at all. Its syntax is weird and painful to read. The only nice things about R are the Tidyverse and ggplot. I found Python to be a better alternative. You can use Pandas for data analysis and EDA, Matplotlib and Seaborn for plotting, and Scikit-learn for training your models. An additional benefit is that Python is a general-purpose language that you can use to build a complete application.

In almost all of the use cases you mentioned, R blows Python out of the water.

Working with dataframes in R is much, much more convenient than in Pandas (loc, iloc, etc.?)
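
For anyone who hasn't run into it, here is a minimal sketch of the loc/iloc split being grumbled about above: `loc` selects by label, `iloc` by integer position. The frame, its index, and the column name are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame: names as the index, one numeric column.
df = pd.DataFrame({"score": [90, 72, 85]}, index=["ann", "bob", "cy"])

# Label-based lookup: row labelled "bob", column labelled "score".
by_label = df.loc["bob", "score"]

# Position-based lookup: second row, first column.
by_position = df.iloc[1, 0]

# Boolean filtering goes through loc as well.
high_scores = df.loc[df["score"] > 80]

print(by_label, by_position, list(high_scores.index))
```

Both lookups return the same cell; which one you reach for depends on whether you know the label or the position.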

Plotting is an obvious win for R. Matplotlib is horrible: it's powerful, yes, but it is an absolute pain compared to ggplot.

Scikit-learn is definitely unmatched, but caret is not far behind. Also, R has a plethora of implemented models that Python lacks (from something as basic as decent quantile regression to time series analysis tools).

As for building a complete application, Python is indeed the go-to.

Syntax-wise, using magrittr's pipes is an absolute pleasure. Good luck doing that with Python.
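
For comparison, here is a rough sketch of how a magrittr-style pipeline could be approximated in plain Python. The `pipe` helper is hypothetical (it is not a standard-library function) and the sample rows are invented. It is nowhere near as pleasant as `%>%`, but it shows the idea of threading a value through a sequence of steps.

```python
from functools import reduce


def pipe(value, *funcs):
    """Pass value through funcs left to right, roughly like x %>% f %>% g in R."""
    return reduce(lambda acc, func: func(acc), funcs, value)


# Invented sample data for illustration only.
rows = [
    {"lang": "R", "stars": 3},
    {"lang": "Python", "stars": 5},
    {"lang": "R", "stars": 4},
]

# Filter, then project, then aggregate, reading top to bottom like a pipe chain.
result = pipe(
    rows,
    lambda rs: [r for r in rs if r["lang"] == "R"],
    lambda rs: [r["stars"] for r in rs],
    sum,
)

print(result)
```

Pandas itself leans on method chaining for the same purpose, and `DataFrame.pipe` covers the case where a step is a free function rather than a method.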

Just as an FYI: the statsmodels Python package just released numerous new time series tools in version 0.11 rc1 [1] and also has functions for quantile regression [2].

[1] https://github.com/statsmodels/statsmodels/releases
[2] https://www.statsmodels.org/dev/examples/notebooks/generated...

His #1 requirement was a language that isn't painful, and nothing short of not being R can satisfy that.

I use R every day for statistical analysis because of certain interfaces it has, and I still hate it every day.

Exactly. I initially liked Pandas, but then I discovered what I can do with data frames in R, visualizations with ggplot, and SQL-like data manipulation using dplyr with pipes from magrittr. R may have the steeper learning curve and, for certain uses, be inferior to Python, but it's a wonderful language.