You're not alone! I think pandas made some design decisions around its transformation functions that make them a lot more cumbersome to use than R's dplyr. It's not obvious from the documentation, though.

As an example from the pandas docs [1], in dplyr you can do

> gdf <- group_by(df, col1)
> summarise(gdf, avg=mean(col1))

In pandas this is similar to

> df.groupby('col1').agg({'col1': 'mean'})

But dplyr's summarise is much more flexible than agg, because a single expression can combine any number of columns. E.g.

> summarise(gdf, some_name = f1(col1) + f2(col2))

With agg in pandas, each function you pass only ever sees one column at a time, so you can't combine columns inside a single aggregation.

[1] http://pandas.pydata.org/pandas-docs/stable/comparison_with_...
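For what it's worth, the closest pandas equivalent I know of for the multi-column summarise above is groupby(...).apply with a function that receives the whole group. This is only a rough sketch: the toy data, the "key" grouping column, and the summarise_group helper are made up for illustration, and mean and max stand in for the unspecified f1 and f2.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "col1": [1.0, 3.0, 5.0, 7.0],
    "col2": [10.0, 20.0, 30.0, 40.0],
})


def summarise_group(g: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: combine two columns of one group into one value."""
    # mean and max are stand-ins for f1 and f2 from the comment.
    return pd.Series({"some_name": g["col1"].mean() + g["col2"].max()})


# Unlike agg, apply hands each group to the function as a DataFrame,
# so the function can look at several columns at once.
result = df.groupby("key")[["col1", "col2"]].apply(summarise_group)

# Alternative: aggregate each column separately, then combine the results.
grouped = df.groupby("key")
alt = grouped["col1"].mean() + grouped["col2"].max()

print(result)
```

Both versions give the same numbers, but neither is as compact as the one-line summarise call, which is really the point.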