by joelschw 1710 days ago
I'd take a lazy, typed data manipulation language over pandas all day
2 comments

If you can stay away from Python/pandas completely and get all your work done in typed languages like Scala/Java, that's good. But a lot of scientists and non-CS folks are using Python/R. They need to avoid the mishmash of bringing in Spark and SQL for some bits and then going back to Python/R. Native Python, especially, offers mature ways to handle data in the 100s of GB. Learning to incorporate Dask and Numba is going to be far easier than teaching all these folks distributed programming and spinning up Spark clusters, which is unnecessary in many cases.
Depends on what for. I'm a Java guy at heart, but honestly, for quick little analyses pandas is way faster, and I can barely code my way out of a paper bag in Python.

I even hate Python and would never use it ... but I can't find anything better than pandas for my crazy large time series and the always-bespoke questions from the biz.