by baldfat 3820 days ago
R has a lot of choices. I find the Hadley Universe of libraries (ggplot2, ggvis, dplyr, tidyr, stringr, readr and others), with %>% piping, to be the easiest code for reading and getting things done. R has been transformed in the last 5 years but still gets a bad rap which I don't feel it deserves.

R is really more of a functional language, which most people don't recognize, and when they learn the standard core of R with its Lisp-inspired ideas it really throws them for a loop. I ended up learning Racket (after trying to teach myself Haskell 3 times) and I can say I now "get it," but using the new libraries has really made that point moot.

I love R and am excited to see all the support from Microsoft and other large companies jumping on board.

This does look like a decent way to move to Python though. If someone were to do that, I hope they use the Rodeo IDE.

3 comments

You mention that R doesn't deserve the bad rap it had 5 years ago. Less than two years ago I learned the hard way that R reference counting had only three possible counter values. 0, 1, and 2+. So if you took a second reference to something then deleted it, that object's reference count would stay at 2+ forever. Then if you modified it, Copy On Write would kick in, even though there was only one reference living. This was the source of crippling inefficiency in production code. I asked about it back then and the reply was yeah, R does that, hopefully it won't soon. Broken Copy On Write was not acceptable to me in 2014.
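For anyone who hasn't run into this, here is a toy model of the behaviour described above, written in Python for brevity. It is only a sketch of the saturating-counter idea, not R's actual implementation; the `Value` and `Env` classes and their method names are made up for illustration.

```python
SATURATED = 2  # stands for "2 or more references"


class Value:
    """A toy vector carrying a saturating reference counter (0, 1, or 2+)."""

    def __init__(self, data):
        self.data = list(data)
        self.named = 0


class Env:
    """A toy environment mapping names to values (hypothetical, for illustration)."""

    def __init__(self):
        self.bindings = {}
        self.copies_made = 0

    def bind(self, name, value):
        # Taking a reference bumps the counter, but it saturates at "2+".
        value.named = min(value.named + 1, SATURATED)
        self.bindings[name] = value

    def remove(self, name):
        # Dropping a reference does NOT decrement the counter: once it
        # reads "2+", the true number of live references is unknown.
        del self.bindings[name]

    def set_element(self, name, index, item):
        value = self.bindings[name]
        if value.named >= SATURATED:
            # Possibly shared, so copy before writing (copy on write).
            value = Value(value.data)
            value.named = 1
            self.bindings[name] = value
            self.copies_made += 1
        value.data[index] = item


def demo():
    env = Env()
    env.bind("x", Value([1, 2, 3]))
    env.set_element("x", 0, 10)        # one reference: modified in place
    copies_before = env.copies_made
    env.bind("y", env.bindings["x"])   # second reference: counter hits "2+"
    env.remove("y")                    # only "x" is left, counter still "2+"
    env.set_element("x", 1, 20)        # copies anyway
    return copies_before, env.copies_made, env.bindings["x"].data
```

In this model `demo()` reports no copies before the second reference is taken and one copy afterwards, even though only a single name refers to the data by the time of the second write.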
I'll second that R is really awesome! Being a computer scientist, it opens up for me a lot of statistical packages.
R is the only real game in town for free software for scientific computing these days.

From a political perspective R was in the right place at the right time. It was a decent high level language that could handle matrix processing gracefully. Scipy/Numpy weren't ready for production yet. The others were Matlab, SAS, and Stata, all of which R makes look like APL.

I'm glad the field has a more healthy open source landscape with worthy competitors in the form of Julia and Numpy/Scipy.

R's biggest sin is failure to force people to use functional paradigms by providing juuuust enough imperative sugar to make average Joe programmer feel at home. R is a functional language, and that's the principle under which it should be taught.

That said, R also has a large number of main technical strengths:

1. Fast basic statistics within the REPL so you can test hunches quickly.

2. Cutting edge algorithms that often aren't implemented anywhere else.

3. Hugely strong engineering packages for civil, environmental, defense, aerospace and basically any IRL engineering field you can think of.

R comes free, with these advantages and many more for the average lab tech.

> R is the only real game in town for free software for scientific computing these days.

You have a very, very narrow definition of scientific computing, excluding a huge part of the field, i.e. anything written in C, C++ or Fortran. And I've never seen people in aerospace or civil engineering use R; it seems to be mostly popular in statistics-heavy, "softer" fields such as biology or economics.

Edit: to give some examples of what I mean: show me a (molecular dynamics|computational fluid dynamics|finite element method|Poisson solver|magnetohydrodynamics solver|electrodynamics solver|general relativity code|quantum many-body solver|lattice field theory code) written in R. I haven't seen any.

Just wanted to see a few. R has such a strong Fortran code base that I knew that they needed to be in R somewhere.

Molecular dynamics - https://cran.r-project.org/web/packages/bio3d/index.html

computational fluid dynamics - http://search.r-project.org/library/rjacobi/html/xinterp.htm...

finite element method - https://cran.r-project.org/web/packages/RTriangle/RTriangle....

Poisson - https://cran.r-project.org/web/packages/isotone/isotone.pdf

None of those are actually simulation codes. The first and third are pre- and post-processing tools. The second is an interpolation tool. The fourth uses Poisson distributions, which is very different from solving the Poisson equation.
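To make the last distinction concrete, here is a minimal sketch in Python (chosen only for brevity; the function names are made up for illustration). The first function evaluates the Poisson distribution's probability mass function. The second solves the 1D Poisson equation -u''(x) = f(x) with zero boundary values, using a standard second-order finite-difference stencil and the Thomas algorithm for the resulting tridiagonal system.

```python
import math


def poisson_pmf(k, lam):
    """Poisson *distribution*: probability of observing k events at rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def solve_poisson_1d(f, n):
    """Poisson *equation*: solve -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.

    Returns u at the n interior grid points x_i = (i + 1) / (n + 1).
    """
    h = 1.0 / (n + 1)
    # Discretisation: (-u[i-1] + 2*u[i] - u[i+1]) / h^2 = f(x_i)
    diag = [2.0] * n
    rhs = [h * h * f((i + 1) * h) for i in range(n)]
    # Forward elimination (Thomas algorithm); both off-diagonals are -1.
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u
```

With f(x) = 1 the exact solution is u(x) = x(1 - x)/2, which this stencil reproduces at the grid points up to rounding error. The two functions share nothing beyond the name.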
We just have a mismatch in the term "scientific programming" based on our perspectives. You seem like you're in the harder sciences in academia, while I'm in data science in industry.

I'll certainly cede the point that there's a great deal of important scientific code in many languages that can't be accessed from R.

The thing is, R interoperates very well with C, C++ and Fortran. So when someone who uses R needs to solve one of those problems, they'll generally just write it in C, C++ or Fortran, then call the function/program from R, get the results, chart/analyse them, etc...

And of course, R makes data exploration, statistics, and all those easier problems incredibly simple.

Don't get me wrong, I'm a frequent R user, and it is definitely useful for analysing simulation results. My point was just that there are a lot of things outside of R's capabilities. Even for analysis there are areas where R is of no use, e.g. when plotting data from 3D turbulence simulations, like Q-criterion isosurfaces:

https://www.nas.nasa.gov/SC12/assets/images/content/Chaderji...

Yeah, for legacy or speed, you might need to drop down to lower libraries. But R makes that pretty easy.
> R's biggest sin is failure to force people to use functional paradigms by providing juuuust enough imperative sugar to make average Joe programmer feel at home. R is a functional language, and that's the principle under which it should be taught.

100% in agreement. That states what I have discovered after learning Functional programming.

The sad thing is people don't know that it is functional PAST first class functions. http://link.springer.com/chapter/10.1007%2F978-3-642-40447-4...

And lists + dataframes as first class citizens. Everything (including dataframes) is pretty much a list. Most R packages nowadays have dataframes in the centre of implementation.

Compare it to other languages, where lots of code is needed to convert the data from one format to another because there's no common underlying data structure (Python is much better than the rest in that regard, but still not as good as R).

"From a political perspective R was in the right place at the right time. It was a decent high level language that could handle matrix processing gracefully. Scipy/Numpy weren't ready for production yet. The others were Matlab, SAS, and Stata".

This is forgetting xlispstat, which was frankly the best of the lot. Free, functional (based on a subset of Common Lisp), and with dynamic graphics capabilities that R is now only beginning to match. But the problem I think was that it was maybe too early. In the 1990s, Lisp and functional programming seemed to be on the way out and object-orientation was the big thing (It's telling that the book on XlispStat is called "LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics", hyping the objects rather than the functional aspects). If it had come out now, with the current interest in FP/Lisp thanks to Clojure and Racket, it probably would have been more successful.

"Matlab, SAS, and Stata, all of which R makes look like APL" -- I am a little confused, what do you mean by that?

(and tbh, I would not call either SAS or Stata matrix programming particularly graceful -- it's pretty hard to beat Matlab at that though)

> R is really more a functional language which most people don't recognize and when they learn the standard core of R with Lisp inspired ideas it really throws them for a loop.

Just to clarify, R was initially a dialect of Scheme. On the other hand, with its frustrating silent type changes, you can see its S roots in the same place that gave us C and C++. That's quite a combination in terms of learning curve.

There is also Lisp stuff in R that doesn't come from Scheme, like an object system inspired by CLOS, FEXPRs from very old Lisps, ...