| I wrote a function once for a friend that modified the enclosing environment, and changed + so that sometimes it added two numbers together and sometimes it added two numbers and an extra 1 just to be helpful. I can sort myself out, but thanks for the thoughts.

The issue is that I learn these things /after/ R does something absolutely off the wall with its type system. And a lot of my exposure comes from using other people's libraries. For my own work I just use tidyverse for everything. It solves all my complaints, mainly by replacing apply() with mutate(), replacing data.frame with tibble, and giving me access to the relational join commands from dplyr. I'm cool with the fact that my complaints are ultimately petty.

> For attaching metadata to anything, why not use attributes()/attr() or the tidy equivs? Isn't that what it is for?

I've never met attr before, so I am unaware of any library that uses attr to expose data to me. The usual standard, as far as I can tell, is to return a list.

> It might not make you feel much better, but data.frame is just a special list, c.f. is.list(data.frame()). So, if you don't want to use the convenience layers for data.frame you can just pretend it is a list and reduce the ways of accessing data structures by one.

Well, I could. But data frames have the relational model embedded into them, so all the libraries that deal with relational data use data frames or some derivative. I need that model too; most of my data is relational. The issue is that sometimes base R decides that, since the data might not be relational any more, it needs to change the data structure. This famously happens with apply() returning a pure list, or with dat[x, y] sometimes being a data frame and sometimes a vector depending on the value of y. It has been a while since I've run into any of this because, as mentioned, most of it was fixed up in the Tidyverse verbs and tibble (with things like its list-columns).
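To make the dat[x, y] complaint concrete, here is a minimal sketch in base R. The data frame `dat` and its columns are made up for illustration; the only thing it relies on is the `drop` argument of base R's data frame indexing.

```r
# Hypothetical toy data frame, for illustration only.
dat <- data.frame(id = 1:3, score = c(10, 20, 30))

# Selecting two columns keeps the data frame structure.
two_cols <- dat[, c("id", "score")]

# Selecting one column silently drops down to a plain vector.
one_col <- dat[, "score"]

# drop = FALSE asks base R to keep the data frame even for one column.
kept <- dat[, "score", drop = FALSE]

class(two_cols)
class(one_col)
class(kept)
```

So the type of the result depends on how many columns y happens to name, unless drop = FALSE is passed every time. As far as I know, tibble sidesteps this by never dropping in `[`.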
> `%matrix_mult%` <- function(x, y) {
>   if ("data.frame" %in% class(x)) {
>     x <- as.matrix(x)
>     stopifnot(all(is.numeric(x)))
>   }
>   if ("data.frame" %in% class(y)) {
>     y <- as.matrix(y)
>     stopifnot(all(is.numeric(y)))
>   }
>   stopifnot(dim(x)[2] == dim(y)[1])
>   x %*% y
> }

I have got absolutely no idea what that does in all possible edge cases, and to be honest the problem it solves isn't one I confront often enough to look into it. It just bugs me that I have to use as.matrix() to tell R that my 2d data is all made up of integers, when it already knows it is 2d data (because it is a data frame) and that it is made up of integers (because a data frame is a list of vectors, which can be checked to be integer vectors). I don't instinctively see why it can't be something handled in the background of the data.frame code, which already has a concept of row and column number. Having a purpose-built data type only makes sense to me in the context that at one point they used it to gain memory efficiencies.

I mean, on the surface

    data %>% select(-date) %>% foreign_function()

and

    data %>% select(-date) %>% as.matrix %>% foreign_function()

look really similar, but changing data types halfway through actually adds a lot of cognitive load to that one-liner, because now I have to start thinking about converting data structures in the middle of what was previously high-level data manipulation. And you get situations that really are just weird and frustrating to work through, e.g. [1].

[1] https://emilkirkegaard.dk/en/?p=5412 |
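As a rough sketch of why the as.matrix() step is not free to ignore: the conversion picks one type for the whole matrix, so the result depends on which columns are still present. The data frame `dat` and its columns below are hypothetical, chosen only to mirror the select(-date) example above.

```r
# Hypothetical toy data, for illustration only.
dat <- data.frame(
  date = as.Date(c("2020-01-01", "2020-01-02")),
  a    = 1:2,
  b    = 3:4
)

# With the date column still present, every cell is coerced to character.
m_all <- as.matrix(dat)

# With only the integer columns, the result is an integer matrix.
m_int <- as.matrix(dat[, c("a", "b")])

typeof(m_all)
typeof(m_int)
```

That is the part I have to keep in my head halfway through the pipe: forgetting to drop one non-numeric column changes the type of everything.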