| HN Mirror



	by cuteboy19 1324 days ago
	Why does pandas code often feel ugly and clunky compared to the equivalent SQL? Is there no better way to do this?

2 comments

wswope 1324 days ago

I find Pandas and SQL to be complementary rather than an either-or choice. For anything in the tens of GB range or smaller, it’s easy enough to move between the two with read_sql_query and to_sql.

The general strategy is to build the core of any dataset as a SQL query that handles joins and performance-sensitive parts of the query, then polish/plot/yeet into weird shapes with Pandas since it offers much greater expressivity.
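As a rough sketch of that workflow (not wswope's actual code), the example below does the join and aggregation in SQL and then reshapes the result in Pandas. The in-memory SQLite database, the `orders` and `customers` tables, and all column names are made up for illustration; only `to_sql` and `read_sql_query` come from the comment above.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for a real SQL backend.
conn = sqlite3.connect(":memory:")

# Made-up sample data, loaded into SQL with to_sql.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [25.0, 75.0, 40.0, 10.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "region": ["east", "west", "east"],
})
orders.to_sql("orders", conn, index=False)
customers.to_sql("customers", conn, index=False)

# Step 1: let SQL handle the join and the aggregation.
query = """
SELECT c.region, o.customer_id, SUM(o.amount) AS total
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
GROUP BY c.region, o.customer_id
"""
core = pd.read_sql_query(query, conn)

# Step 2: reshape in Pandas, where pivoting is more convenient than in SQL.
wide = core.pivot_table(
    index="region", columns="customer_id", values="total", fill_value=0.0
)

# Each customer's share of their region's total.
share = wide.div(wide.sum(axis=1), axis=0)

conn.close()
```

The split here is a judgment call: the pivot and the row-wise normalization are one line each in Pandas, while the join stays in the database where indexes and the query planner can help.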


cuteboy19 1324 days ago

What bugs me about pandas is that it is so copy-heavy. I just wanted to know if there was some Pythonic way to get performance without just writing normal SQL.


wswope 1324 days ago

Any specific examples you have in mind?


throwamon 1324 days ago

https://github.com/machow/siuba for one
