by closed 1606 days ago
Author here. My concern in the article isn't that you can't do groupby fast, but that the approach above is not composable.

* If f() converts grouped data to something ungrouped, then you can't chain a similar grouped function after it, as in f2(f(group))

* If f() returns a grouped object, then you can't do basic operations like f(grouped) + 1, because DataFrameGroupBy and SeriesGroupBy do not define basic operators like addition, let alone operations against other grouped data.
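
To make the two bullets concrete, here is a rough sketch with a toy frame. The column names `g` and `x` and the choice of `transform("mean")` followed by `cumsum` are my own illustration, not taken from the article, and the operator behavior is what I'd expect from the pandas versions I know of:

```python
import pandas as pd

# Toy data (hypothetical): two groups, "a" and "b".
df = pd.DataFrame({"g": ["a", "a", "b", "b"], "x": [1, 2, 3, 4]})
grouped = df.groupby("g")["x"]  # a SeriesGroupBy

# Bullet 1: f() returns something ungrouped.
# transform("mean") hands back a plain Series, so the grouping is gone.
group_means = grouped.transform("mean")

# A follow-up step on that result runs over the whole Series,
# so cumsum carries across the group boundary.
flat_cumsum = group_means.cumsum()

# To get per-group behavior again, you have to regroup by hand.
per_group_cumsum = group_means.groupby(df["g"]).cumsum()

# Bullet 2: the grouped object itself has no arithmetic operators,
# so f(grouped) + 1 fails with a TypeError.
try:
    grouped + 1
    add_supported = True
except TypeError:
    add_supported = False

print(list(flat_cumsum))
print(list(per_group_cumsum))
print(add_supported)
```

The gap between `flat_cumsum` and `per_group_cumsum` is the composability problem: each step has to know whether its input is still grouped, and has to restore the grouping itself if not.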

A lot of this is worked out in siuba now, and this doc explains a bit more:

https://siuba.readthedocs.io/en/latest/developer/pandas-grou...

1 comment

Yeah, makes sense. I actually come from R and the dplyr world. Was really happy to read your article :). I might give siuba a try.