Hacker News
by clircle 2382 days ago
I'm on the opposite side of the fence, but I'd love to hear some specifics on how R is ill-designed and encourages buggy programs.

I'm a heavy user of R, and I like using it a lot. But the language has lots of traps for beginners: code idioms that look correct but are subtly wrong. For example, if you need to iterate over the indices of a vector X, the obvious thing to do is 1:length(X). It looks fine and works fine until you happen to pass a 0-length vector, and then it explodes, because 1:0 counts down and yields c(1, 0) rather than an empty sequence. Similarly, the obvious way to select a subset of rows i and a subset of columns j from a matrix is X[i,j]. But that's wrong too, because if either i or j has length 1, you get a vector instead of a matrix. And I don't even remember off the top of my head what happens if either or both of i and j have length 0. The R Inferno[1] is essentially a big collection of cases like this.
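To make the first trap concrete, here is a minimal sketch in base R. The helper count_iterations is a hypothetical name made up for illustration; seq_along is the base function usually suggested as the safer replacement.

```r
x <- numeric(0)  # an empty vector, e.g. the result of a filter that matched nothing

# 1:length(x) becomes 1:0, which counts DOWN instead of being empty
bad_idx  <- 1:length(x)   # c(1, 0)
good_idx <- seq_along(x)  # integer(0)

# Hypothetical helper: count how many times a for loop body would run
count_iterations <- function(idx) {
  n <- 0
  for (i in idx) n <- n + 1
  n
}

bad_count  <- count_iterations(bad_idx)   # 2: the loop body runs for i = 1 and i = 0
good_count <- count_iterations(good_idx)  # 0: the loop body never runs
```

Inside the bad loop, x[1] is NA and x[0] is a zero-length vector, so whether you get an error or just a silently wrong result depends on what the loop body does with them.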

None of this makes R a bad language, in my opinion. R is far from the only language with surprising edge cases like this. People say that R is designed for statistical analysis more than general programming, but I don't think that's exactly true either. Certainly it excels at writing code for statistical analysis, but I've used R for a lot more than that, and I plan to continue. It's a perfectly fine general-use scripting language.

I think the real reason R gets such a bad reputation is that a lot of people writing and publishing R code aren't programmers by trade. And you know what? That's fine. Because I'd much rather work in a community that values and celebrates the publishing of code than one that shames people for releasing their code because it's "not good enough".

[1]: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

IMO the worst is accessing non-existent items in lists or when using the $ or [[ notation in data.frames: the fact that you get back NULL instead of an error breaks code in unexpected ways, and given that R's debug facilities are basically useless, makes it hard to debug complex code.
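A small sketch of what this looks like in practice, assuming a plain list used as a config object. The names cfg and get_strict are made up for illustration; get_strict is just one possible way to fail loudly, not a base R function.

```r
cfg <- list(threshold = 0.5)

# A typo in the name silently yields NULL instead of an error
typo_dollar  <- cfg$treshold       # NULL
typo_bracket <- cfg[["treshold"]]  # NULL

# Hypothetical strict accessor: stop with an error when the name is absent
get_strict <- function(x, name) {
  if (!name %in% names(x)) {
    stop("no element named '", name, "'", call. = FALSE)
  }
  x[[name]]
}

ok_value <- get_strict(cfg, "threshold")  # 0.5
# get_strict(cfg, "treshold") would raise an error here instead of returning NULL
```

The NULL then flows into later code (comparisons on it return zero-length results, for instance), so the failure usually shows up far away from the typo that caused it.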
When indexing, you can always pass `drop=FALSE` to prevent the result from collapsing to a vector. It will then always return a matrix or data.frame.
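For reference, a short sketch of the difference on a small matrix (the variable names are only illustrative):

```r
m <- matrix(1:6, nrow = 2)  # 2 rows, 3 columns

# Default behaviour: selecting a single row drops the matrix structure
one_row_default <- m[1, ]                # plain vector of length 3, dim is NULL
one_row_kept    <- m[1, , drop = FALSE]  # still a matrix, 1 x 3

# Zero-length index: only extents of length 1 are dropped, so the result
# depends on what the OTHER index selects
no_rows_all_cols <- m[integer(0), ]                 # 0 x 3 matrix
no_rows_one_col  <- m[integer(0), 1]                # integer(0), not a matrix
no_rows_kept     <- m[integer(0), 1, drop = FALSE]  # 0 x 1 matrix
```

With drop=FALSE the result always has two dimensions, so code that later calls nrow() or ncol() on it keeps working for the 0 and 1 cases.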
Still - that's an excellent example of something that's broken by default and outright dangerous for production use, and at the same time very convenient when working interactively. There are probably half a dozen other similar features.

The vast majority of packages are written by somebody taking their interactive session and tidying it up with some functions and tests and then publishing it. But going through and weeding out all these "broken by default for edge case" aspects is a nightmare.

Here are some fun links:

https://www.talyarkoni.org/blog/2012/06/08/r-the-master-trol...

https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

and many others.

I have come to believe that the only people that think R is ok are those that are either:

- beginners that just passed the newbie state, learned a few tricks and feel empowered

- actual experts - that fully understand the minute details of the implementation and data models

I have been using R on and off for a decade. Whenever I stop using it for a few months, getting back is like wading into a tar pit where I am continuously caught off guard by a myriad of ridiculous problems. Paradoxically, as you get better with R, your errors become more dangerous: your code starts silently doing the wrong things.

R is unlike any other programming language that I have used before (also on and off), from Perl and C to Python and Java. None of these languages has such an incredibly obtuse, illogical and trippy design.

> I have come to believe that the only people that think R is ok are those that are either:

You can say virtually the same thing about every programming language that is made to be easy to learn by hiding complexity, like Python or Ruby.