| > Tidy features (like pipes) are detrimental to performance. Detrimental to the runtime performance; if you happen to be reading and processing tabular data from a csv (which is all I've ever used R for, I must admit), then you get real performance gains as a programmer. For one thing, it allows a functional style where it is much harder to introduce bugs. If someone is trying to write performant code they should be using a language with actual data structures (and maybe one that is a bit easier to parallelise than R). The vast bulk of the work done in R is not going to be time sensitive but is going to be very vulnerable to small bugs corrupting data values. Tidyverse, and really anything that Hadley Wickham is involved in, should be the starting point for everyone who learns R in 2018. > languages like R and MATLAB that were designed for data frames and matrices Personal bugbear; the vast majority of data I've used in R has been 2-dimensional, often read directly out of a relational database. It makes a lot of sense why the data structures are as they are (language designed a long time ago in a RAM-lite environment), but it is just so unpleasant to work with them. R would be vastly improved by a /single/ standard "2d data" class with some specific methods for "all the data is numeric so you can matrix multiply" and "attach metadata to a 2d structure". There are 3 different data structures used in practice amongst the R libraries (matrix, list-of-lists, data.frame). Figuring out what a given function returns and how to access element [i,j] is just an exercise in frustration. I'm not saying a programmer can't do what I want, but I am saying that R promotes a complicated hop-step-jump approach to working with 2d data that isn't helpful to anyone - especially non-computer engineers. |
For attaching metadata to anything, why not use attributes()/attr() or the tidy equivalents? Isn't that what they are for?
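A minimal sketch of the attr() route, assuming a small made-up data frame `d` and hypothetical attribute names `source` and `units` (none of these come from the thread):

```r
# Hypothetical example data; the column and attribute names are illustrative only.
d <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Attach arbitrary metadata to the object as attributes.
attr(d, "source") <- "sales_db"
attr(d, "units") <- c(value = "kg")

# Read a single attribute back, or list the names of everything attached.
attr(d, "source")
names(attributes(d))
```

One caveat with this approach: custom attributes are not guaranteed to survive every operation that rebuilds the object, so it is worth checking they are still there after a transformation.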
It might not make you feel much better, but a data.frame is just a special list, cf. is.list(data.frame()). So, if you don't want to use the convenience layers for data.frame, you can just pretend it is a list and reduce the number of ways of accessing data structures by one.
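As a rough illustration of the list view, here is a sketch using a made-up data frame `d` (the names `d`, `a`, `b` and `col_means` are hypothetical):

```r
# Hypothetical example data.
d <- data.frame(a = 1:3, b = c(10, 20, 30))

# A data.frame is a list of equal-length columns.
is.list(d)

# List-style access: take column "b", then its second element.
d[["b"]][2]

# The data.frame convenience form for the same element.
d[2, "b"]

# Ordinary list tools work column by column.
col_means <- vapply(d, mean, numeric(1))
```

Both access forms reach the same element; the list form just makes the "column first, then row" structure explicit.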
You can paper over the distinction between data.frames and matrices if it comes up for you often enough. E.g.
`%matrix_mult%` <- function(x, y) {
  if ("data.frame" %in% class(x)) {
    x <- as.matrix(x)
    stopifnot(all(is.numeric(x)))
  }
  if ("data.frame" %in% class(y)) {
    y <- as.matrix(y)
    stopifnot(all(is.numeric(y)))
  }
  stopifnot(dim(x)[2] == dim(y)[1])
  x %*% y
}
d1 %matrix_mult% d2
... but I'll grant that isn't the language default.