|
|
|
|
|
by qwhelan
2355 days ago
|
|
FWIW it looks like pandas is slow/OOM-ing because the benchmarks solely use Categoricals, which aren't as heavily used by pandas users compared to R. In particular, I suspect the benchmark sizing is forcing falling back from numpy's int64 to Python ints as categorical labels, which easily could explain a 10x or more differential. |
|