by cwyers 2017 days ago
Thanks, the video helped explain some things, along with this post from the R-devel list:

https://stat.ethz.ch/pipermail/r-devel/2020-December/080173....

The reason for announcing the new lambda syntax at the same time seems to be to enable certain workflows that the magrittr pipe supports. The %>% operator, by default, pipes to the first argument of a function. If you want to pipe to a different argument, you can do:

a %>% func(x, arg2 = .)

It seems like the native pipe doesn't support a placement argument, but you can use the new, more concise lambda operator:

a |> (\(d) func(x, arg2 = d))()

A little more verbose, but it's not a very common use case, it's more general, and I'd happily trade a little more verbosity for the rest of the improvements. (That said, I haven't played around with the magrittr 2.0 improvements yet, so maybe the difference is going to end up being less than the presentation suggests.)
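
To make that concrete, here is a minimal sketch of the native-pipe version. It assumes R 4.1.0 or later, and `func` is a hypothetical stand-in defined only for illustration. One caveat I'm fairly sure of: in released R the lambda has to be wrapped in parentheses and then called, because a bare function expression on the right-hand side of `|>` is rejected.

```r
# Assumes R >= 4.1.0, which provides both |> and the \(x) lambda shorthand.
# `func` is a hypothetical placeholder, not a real library function.
func <- function(x, arg2) paste(x, arg2)

a <- "piped value"

# Wrap the lambda in parentheses and call it, so the piped value
# ends up in `arg2` instead of the first argument.
result <- a |> (\(d) func("first", arg2 = d))()

print(result)
```

If I recall correctly, R 4.2.0 later added a `_` placeholder for `|>` (usable only as a named argument), which covers the same case without a lambda.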


The use of "." as an argument is actually probably one of my most common wtf's with pipes in general.

I tend to use it a lot if I'm just piping a vector to base functions (gsub has x as its third argument, grep as its second).

This syntax looks like it makes that a little harder, but the new error messages are going to make everything so much better that I'm totally fine with it.
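
A small sketch of that base-function case with the native pipe, assuming R 4.1.0 or later. The vector and patterns are made up for illustration.

```r
# Assumes R >= 4.1.0. The sample data below is invented for this example.
x <- c("foo bar", "bar baz")

# gsub(pattern, replacement, x): the vector is the third argument,
# so a lambda is used to route the piped value there.
cleaned <- x |> (\(v) gsub("bar", "qux", v))()

# grep(pattern, x): the vector is the second argument.
hits <- x |> (\(v) grep("baz", v))()

print(cleaned)
print(hits)
```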

It is particularly infuriating in R, because

    lm(y ~ ., data = my_dataframe)
already means "regress the variable y on all other columns in `my_dataframe`." For big, interactive regressions, it's really natural to write

    my_original_dataframe %>%
        do_a_bunch_of_transformations() %>%
        select(...) %>% # Pull out just the columns you want
        lm(y ~ ., data = .)
and god knows how that last line is going to be interpreted. So disambiguating through some mechanism is necessary anyway. A lambda is much better than some temporary variable that just holds the formula `y ~ .`.
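
As a rough sketch of the lambda approach, assuming R 4.1.0 or later and using the built-in `cars` dataset in place of `my_dataframe`: the lambda gives the piped data frame its own name, so the only `.` left is the one in the formula.

```r
# Assumes R >= 4.1.0. `cars` ships with base R (columns: speed, dist).
# Inside the lambda the data frame is called `d`, so the `.` in the
# formula can only mean "all other columns".
fit <- cars |> (\(d) lm(dist ~ ., data = d))()

print(coef(fit))
```
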
The zfit package is intended to address this issue, with zlm() and comparable functions that are very thin wrappers around lm() and friends. The only thing they do is flip the argument order so the data comes first, making exactly this use case much simpler. So you can do:

    cars %>% zlm(dist ~ speed)
(or now)

    cars |> zlm(dist ~ speed)
https://github.com/torfason/zfit
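
For anyone curious what "flip the argument order" amounts to, here is a minimal sketch of the idea. It assumes R 4.1.0 or later, and `lm_data_first` is a hypothetical name; this is not zfit's actual implementation, which may well differ.

```r
# Hypothetical data-first wrapper, written only to illustrate the idea.
# Not the real zlm() from zfit.
lm_data_first <- function(data, formula) {
  lm(formula, data = data)
}

# With the data as the first argument, the pipe needs no placeholder.
fit <- cars |> lm_data_first(dist ~ speed)

print(coef(fit))
```
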
Tbh, I would 1000% rather my coworkers write a lambda function or closure where it's necessary than add a new package dependency just to change the order of arguments in widely used functions.

Plus, I still wouldn't trust the code

    cars %>% zlm(dist ~ .)
to necessarily work the way I want, or to work the same way across package versions.
I think magrittr 2.0 has addressed that problem also.