by hidenotslide 3058 days ago
Good point about pandas dataframes taking up extra space, and the solution of using chunking/generators. 33 GB is what Wes McKinney would call "medium data": https://twitter.com/wesmckinn/status/413159516096585729

The problem is that libraries that work fine when everything fits in RAM start breaking down if you aren't careful. It's not really a Python speed issue, but you lose some of the tools you relied on previously.
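
For anyone who hasn't tried the chunking/generator approach, here is a rough sketch of the idea using only the standard library. The helper names (iter_chunks, column_mean) and the sample data are made up for illustration; with pandas you would probably reach for read_csv(..., chunksize=N) instead, which also gives you an iterator of smaller frames.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size rows from any iterable of rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def column_mean(file_obj, column, chunk_size=2):
    """Stream a CSV and compute the mean of one numeric column chunk by chunk."""
    reader = csv.DictReader(file_obj)
    total = 0.0
    count = 0
    for chunk in iter_chunks(reader, chunk_size):
        # Only one chunk of rows is held in memory at a time.
        total += sum(float(row[column]) for row in chunk)
        count += len(chunk)
    return total / count if count else float("nan")


# Tiny in-memory stand-in (hypothetical data) for a file too large to load at once.
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n")
mean_value = column_mean(sample, "value")
```

A running sum and count work fine per chunk, which is why a mean is easy here. Things like an exact median, a global sort, or a join need more thought once the data no longer fits in memory, and that is the "lost tools" part.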