by kuzehanka 2609 days ago
Uhhh but pandas is literally a chained architecture? Have you actually used it?

I have used pandas extensively; it was my main statistics environment for a couple of years before I switched back to R for the tidyverse. At the time, chaining was not well supported or idiomatic; multi-indexing was all the rage.

I still occasionally use pandas with seaborn when it's not worth it to switch out to R. I don't think it can match the tidyverse+ggplot combo for quickly exploring and making beautiful plots. But this discussion has inspired me to do some googling and it seems like some people are using tidyverse-like workflows in pandas (https://stmorse.github.io/journal/tidyverse-style-pandas.htm...). Doesn't seem quite as smooth but I'll definitely be trying it out next time I'm working in pandas.
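For anyone curious what a tidyverse-like workflow can look like in pandas, here is a minimal sketch of a method chain. The data frame and its column names (species, length, width) are made up for illustration, and the dplyr verbs in the comments are only rough analogues, not exact equivalents.

```python
import pandas as pd

# Made-up toy data, purely for illustration.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "length": [1.0, 2.0, 3.0, 4.0, 5.0],
    "width": [0.5, 1.0, 1.0, 2.0, 2.5],
})

summary = (
    df
    .query("length > 1")                                # roughly dplyr::filter
    .assign(ratio=lambda d: d["length"] / d["width"])   # roughly dplyr::mutate
    .groupby("species", as_index=False)                 # roughly dplyr::group_by
    .agg(mean_ratio=("ratio", "mean"),                  # roughly dplyr::summarise
         n=("ratio", "size"))
    .sort_values("mean_ratio", ascending=False)         # roughly dplyr::arrange
    .reset_index(drop=True)
)

print(summary)
```

Wrapping the chain in parentheses is what lets each step sit on its own line, which is about as close as Python gets to the %>% pipe layout.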

I've used both and have two additional comments.

Some of dplyr's elegance comes from R's flexible evaluation mechanism, whereby mutate(data, col1+col2) works because the second argument is evaluated in an enriched environment that includes the data's columns. Python eschews these kinds of macro-like extensions because, my guess is, tampering with evaluation makes a lot of other things complicated (for instance, you can no longer replace an argument with its value and expect the call to behave the same way). I think the author of dplyr himself has, in later work, promoted the use of the ~ operator to explicitly block evaluation of an argument and at least make these departures from regular evaluation explicit. That means dplyr is ahead for interactive use, but for programming you have to switch to a separate API (the underscore "verbs"), and that makes the transition from interactive work to coding a bit steeper. It's all trade-offs, and I am not saying that I know better than either the pandas or dplyr authors.
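To make the contrast concrete, here is a small sketch of how the same col1+col2 computation is usually spelled in pandas, where nothing is evaluated lazily behind your back. The data frame and the helper add_columns are hypothetical, made up for this example.

```python
import pandas as pd

# Made-up data; col1 and col2 mirror the names used in the comment above.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})

# Option 1: defer evaluation explicitly with a lambda that receives the frame.
out_lambda = df.assign(total=lambda d: d["col1"] + d["col2"])

# Option 2: pass the expression as a string; DataFrame.eval resolves the
# column names, so the "special" evaluation is visible at the call site.
out_eval = df.assign(total=df.eval("col1 + col2"))

# Because column names are ordinary strings, the interactive spelling and the
# programmatic one are the same API (add_columns is a hypothetical helper).
def add_columns(data, left, right, name):
    return data.assign(**{name: lambda d: d[left] + d[right]})

out_func = add_columns(df, "col1", "col2", "total")

print(out_func)
```

The price is more typing in interactive use; the payoff is that arguments are plain values, so substituting a variable for a column name just works.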

As to ggplot, if you believe the future of statistical graphics is in-browser and interactive, you should take a look at altair for python (I myself created a small extension to it called altair_recipes). It's based on vega, like ggplot's anointed (but not quite ready) successor ggvis, and uses the grammar of graphics (or an interpretation thereof) like ggplot, with extensions for interaction. Simpler than D3 by most accounts.