by minimaxir 2931 days ago
> This is a very bad example of what factors are for in R, because it makes it seem like factors are for defining variables or keys in key value pairs

That is the approach for tidy data, which is used a lot in the R tidyverse (http://tidyr.tidyverse.org/articles/tidy-data.html)

1 comment

>> This is a very bad example of what factors are for in R, because it makes it seem like factors are for defining variables or keys in key value pairs

> That is the approach for tidy data, which is used a lot in the R tidyverse (http://tidyr.tidyverse.org/articles/tidy-data.html)

Do you have a reference to where Hadley et al. suggest using factors in a key-value system? I'm reading Wickham's books at the moment and have not seen this assertion. Indeed, I believe he would not state this, as he explains the utility of factors explicitly:

    A factor is a vector that can contain only predefined values, and is used to store categorical
    data... Factors are useful when you know the possible values a variable may take, even if you don’t
    see all values in a given dataset...
Advanced R, pp. 21-22
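
As a minimal sketch of the behaviour the quoted passage describes, in base R (the variable name and values are illustrative assumptions, not taken from the thread):

```r
# Hypothetical data: only "m" is observed, but "f" is also a known level.
sex <- factor(c("m", "m", "m"), levels = c("m", "f"))

# Both predefined levels are kept, even though "f" never appears in the data.
levels(sex)

# The tabulation includes the unobserved level "f" with a count of 0.
counts <- table(sex)
counts

# Assigning a value outside the predefined levels produces NA (with a warning),
# which is what "can contain only predefined values" means in practice.
sex[2] <- "x"
is.na(sex[2])
```

Nothing here involves keys or key-value pairs; the levels only constrain which categorical values the vector can hold.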

It was my interpretation of the original article quote that it was referring to the tidy schema, but I could be incorrect. (The gather() function of tidyr also names its parameters key and value, and the function is described as "Gather columns into key-value pairs": http://tidyr.tidyverse.org/reference/gather.html)

If you are interested in this topic and haven't seen Advanced R, I'd recommend taking a look - the book explains why those functions have key-value pairs as parameter names. Note that the functions you cite aren't related to factors.
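
The gather() call discussed above can be sketched as follows, assuming tidyr is installed; the data frame and its column names are hypothetical. With default arguments the key column comes back as a character vector, and it only becomes a factor if factor_key = TRUE is requested:

```r
library(tidyr)

# Hypothetical wide data: one column per year.
wide <- data.frame(
  country = c("A", "B"),
  y1999 = c(10, 20),
  y2000 = c(30, 40),
  stringsAsFactors = FALSE
)

# Gather the year columns into key-value pairs ("year" is the key, "cases" the value).
long <- gather(wide, key = "year", value = "cases", -country)
class(long$year)    # character by default

# Opting in to a factor key, whose levels follow the original column order.
long_f <- gather(wide, key = "year", value = "cases", -country, factor_key = TRUE)
class(long_f$year)
levels(long_f$year)
```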