by tacos 4223 days ago
I know we don't reward snarky humor 'round these parts, but I was about to say the same thing. Python seems to own this space and the ecosystem around Python and math/stats/analysis is exceptionally healthy. If there's a specific place where R kicks ass please speak up -- it's fallen off my radar.
3 comments

There are three areas where I think R is the clear winner:

1) An IDE for data analysis/programming: RStudio

2) Easy way to turn your analyses into reports: knitr

3) Easy way to turn your analyses into interactive webapps: shiny

(I also think R wins on visualisation and data manipulation, but I'm biased ;)

R absolutely wins on visualization and data manipulation. I'll spare you the immodesty :-)
I use both Python and R a fair bit. As a language, I absolutely prefer Python to R. However, I think there are two areas where R is better than Python, and together I think they add up to a durable advantage, at least for stats people.

1) Package support. Yes, pandas and scikit-learn are good, but R still has a definite edge here. Here are three things I've needed lately where R has hands-down better code available: forecasting, frequent itemset mining, and network community detection.

2) Non-programming uses. There are a lot of tasks where you need a computer, but just to do one thing: make a plot, calculate a statistic, stuff like that. R is better in that use case.
R is in some ways more forgiving to newcomers. Sure, there's all sorts of weirdness around how vectors and matrices work, and don't get me started on the cryptic function naming, but (1) almost all batteries are included -- hardly ever a need to hunt around for packages, (2) RStudio is really nice, with graphics, a shell, a text editor, documentation etc. all in one place, (3) it's mature and well-tested.

I prefer Python myself, but after spending a couple of months with R I do understand why people like it.

(OTOH I'll be a happy person if I never ever have to work with SAS ever again.)

> R is in some ways more forgiving to newcomers.

Oops! Sorry, really sorry, apologies for snorting coffee over you, but given multiple years of experience TA'ing for machine learning / data mining courses, I couldn't disagree more. R had them in absolute knots, and yes, they were asked to use RStudio, if that helped. They struggled with simple things such as writing a naive Bayes classifier. Most of their mistakes came from R's weird and silent inconsistencies: is this a scalar or a vector, a copy or a reference?

It is possible that all of these 30-odd students every year were stupid, but the chances are fairly low.

EDIT:

The course has since switched to Java (KNIME) and Python, and that has gone a whole lot smoother.

Neither Java nor Python is my favorite language, but I have to concede that Python is massively more consistent than R, so a student has fewer special cases to remember, and the usual complaint about a dearth of packages seemed less real, at least in the context of the course. At least in an academic setting, Enthought Canopy / Anaconda do a marvelous job of it.
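
For a sense of scale, the naive Bayes exercise mentioned above fits in a few dozen lines of plain Python. This is only a sketch, not the course's actual assignment: it assumes categorical features and add-one (Laplace) smoothing, the toy data is made up, and the names `train` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict
import math


def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(label, feature_index)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # vocab[feature_index] -> set of distinct values seen in training
    vocab = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest log-posterior."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # Log prior for the class.
        score = math.log(n / total)
        for i, value in enumerate(row):
            count = value_counts[(label, i)][value]
            # Add-one smoothing so unseen values never give log(0).
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data: (outlook, windy) -> play?
rows = [
    ("sunny", "no"),
    ("sunny", "yes"),
    ("rainy", "yes"),
    ("rainy", "no"),
    ("sunny", "no"),
]
labels = ["yes", "yes", "no", "no", "yes"]

model = train(rows, labels)
print(predict(model, ("sunny", "yes")))
```

Everything here is an ordinary dict, set, or float, so there is no scalar-versus-vector question to trip over; whether that makes it easier for a newcomer is still a judgment call.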

I said more forgiving. It's certainly not a forgiving language or ecosystem in absolute terms, you're right on the mark there. But ultimately you have to pick your poison. Do you want to struggle with all of the various quirks of R or do you want to struggle with all of the various quirks of (data analysis in) Python?