| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by fbdab103 1260 days ago

>But if we order by performance/memory efficiency

Right there is the disagreement. Like many (most?) people, all of my data munging is in small/medium data where 10 million+ rows is rare. A multiple of pandas performance will not be noticed for the majority of my operations.

Transitioning to a new api on performance alone is not enough to sway me. After all, I write in Python ;). If I were concerned about better throughput, my first alternative would be Dask - it should give better local performance, but could theoretically scale to enormous data without any code changes.