by ejstronge 2930 days ago
>> This is a very bad example of what factors are for in R, because it makes it seem like factors are for defining variables or keys in key value pairs

> That is the approach for tidy data, which is used a lot in the R tidyverse (http://tidyr.tidyverse.org/articles/tidy-data.html)

Do you have a reference to where Hadley et al. suggest using factors in a key-value system? I'm reading Wickham's books at the moment and have not seen this assertion. Indeed, I believe he would not state this, as he explains the utility of factors explicitly:

    A factor is a vector that can contain only predefined values, and is used to store categorical
    data... Factors are useful when you know the possible values a variable may take, even if you don’t
    see all values in a given dataset...
Advanced R, pp. 21-22
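
To make the quoted behavior concrete, here is a minimal base R sketch. The survey data and variable names are invented for illustration and do not come from the book or the article.

```r
# Hypothetical survey data: three answers are allowed, but only two occur.
responses <- c("yes", "no", "yes", "yes")
answers <- factor(responses, levels = c("yes", "no", "maybe"))

# The predefined values are kept even though "maybe" never appears.
print(levels(answers))

# Counts include a zero entry for the unobserved level "maybe".
print(table(answers))

# Assigning a value outside the predefined levels produces NA
# (R also emits an "invalid factor level" warning here).
answers_bad <- answers
answers_bad[2] <- "sometimes"
print(is.na(answers_bad[2]))
```

Nothing here involves keys or key-value pairs; the factor only constrains which categorical values the vector may hold.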

My interpretation was that the quote from the original article referred to a tidy schema, but I could be wrong. (The gather() function in tidyr also names its parameters key and value, and is described as "Gather columns into key-value pairs": http://tidyr.tidyverse.org/reference/gather.html)
If you are interested in this topic and haven't seen Advanced R, I'd recommend taking a look - the book explains why those functions have key-value pairs as parameter names. Note that the functions you cite aren't related to factors.