by baldfat 4106 days ago
In R, loops are generally frowned upon due to this issue.

BUT R is not slow when it comes to parallel processing, and Revolution Analytics also has speed-ups if you need them. R also has dplyr, which is speedy, and data.table, which is even faster. I think the original Julia speed claims were a little biased toward Julia. There are plenty of awesome things about R, but "slow" isn't a fair statement. There is a reason why R has grown so much.

Interesting argument for R usage: https://matloff.wordpress.com/2014/05/21/r-beats-python-r-be...


"R is slow" refers to the main implementation of the language --- to code written in R --- and is completely fair. The packages you're talking about are mostly written in C or C++ with nice R interfaces. So R as a statistical package or R from a user's perspective is often quite fast, but that's because it is relatively easy to interface with C and C++.

(This comes up often, and I'm not sure why I'm compelled to reply today, but there it is.)

But MANY programming languages work this way. I would say that when coding in R it doesn't matter whether the underlying code is C or Fortran. Why say programming in R is slow when, in practice, it really isn't with some of the new solutions?
Many languages are slow. (Or, more concretely, many languages have only slow implementations.) If C or Fortran interfaces are easy, speeding up the language shouldn't necessarily be a priority. But that doesn't make the language itself fast.

As for why this can matter: a big part of my job is designing and prototyping new statistical estimators. "Prototyping" means running them through lots of simulations to explore their properties in small samples. My options are:

1. Program up the estimator in pure R, in which case the simulations can take days to weeks to complete.

2. Program up the estimator in C and write an R interface, in which case the simulations will run 10 to 100 times faster. (Depending on the exact operations used.) The code will take much longer to adjust if I need to change the estimator, which happens frequently, and debugging will take longer.

3. Program up the estimator and simulations in pure C, making it still faster but much more brittle.

This is a fairly iterative process -- I'll typically discover errors in the math as I run the simulations or after I've run them, or the simulations will reveal unappealing properties that need to be fixed. But once I have a "good" estimator and "good" simulations, I write it up in a paper and am done with it. (Gross simplification of my actual job, but accurate enough.)
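
For anyone who hasn't done this kind of work, here is a rough sketch of what "running an estimator through simulations" can look like. It is written in Python only to keep it short; it is not the commenter's code. The estimator (`mle_variance`), the helper `simulate_bias`, and the sample sizes are all made up for illustration. The toy estimator is the divide-by-n variance, whose small-sample bias is known to be -sigma^2/n, so the simulated numbers can be checked against theory.

```python
import random
import statistics


def mle_variance(sample):
    """Toy estimator (hypothetical example): variance with divisor n.

    It is biased downward in small samples, which makes it a convenient
    thing to "discover" by simulation.
    """
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def simulate_bias(estimator, n, reps, true_value=1.0, seed=0):
    """Monte Carlo estimate of bias: mean of the estimates minus the truth.

    Draws `reps` samples of size `n` from a standard normal (true variance
    1.0), applies `estimator` to each, and averages the results.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.fmean(estimates) - true_value


if __name__ == "__main__":
    # The bias should shrink roughly like -1/n as the sample size grows.
    for n in (5, 20, 100):
        print(n, round(simulate_bias(mle_variance, n, reps=5000), 3))
```

The inner loop calls the estimator once per replication. That loop is the part that gets slow in an interpreted language when the estimator is expensive, and changing the estimator means rerunning all of it.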

So the distinction between "R's speed" and "C's speed" very much matters, and if R were actually as fast as C it would make my life much easier.

Of course, for people who are "coding in R" by running data analysis in a script the distinction doesn't matter. But I already said that in the comment you're replying to.

Thank you for actually replying and clarifying your statement; I enjoy learning different ways of thinking. I guess I am changing my word for R to "Fast Enough for Me." I can run my scripts in under 20 seconds, but a few years ago it could be 2 or 3 minutes.