Hacker News
by zepolen 3402 days ago
Python shouldn't be any slower than Node for crawling if you use the right tools.
1 comment

Oh boy. When I lived in San Francisco the Python community had this on their coffee mugs :O
Your bottleneck should probably be managing your requests, not saving the document/assets. I'm skeptical that language speed is a significant factor in scraping. Rate limiting and being smart about how you fetch data should be your main concern. ... Sure, if you don't mind slamming a server with concurrent requests, your language choice might start to matter, if your IP isn't blocked first.
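To make the "managing your requests" point concrete, here is a rough sketch of what polite fetching could look like in Python with asyncio. Everything named here (PoliteLimiter, crawl, fake_fetch, and the default limits) is made up for illustration and not from any particular library. fake_fetch just sleeps instead of hitting the network, so you'd swap in a real HTTP client call.

```python
import asyncio
import time


class PoliteLimiter:
    """Hypothetical helper: caps in-flight requests and spaces out request starts."""

    def __init__(self, max_concurrent, min_interval):
        self._sem = asyncio.Semaphore(max_concurrent)  # bound on concurrent requests
        self._lock = asyncio.Lock()                    # serializes the pacing logic
        self._min_interval = min_interval              # seconds between request starts
        self._next_start = 0.0                         # earliest allowed next start

    async def __aenter__(self):
        await self._sem.acquire()
        async with self._lock:
            now = time.monotonic()
            wait = self._next_start - now
            if wait > 0:
                # Too soon after the previous request: wait out the remainder.
                await asyncio.sleep(wait)
            # Schedule the next allowed start relative to this one.
            self._next_start = max(now, self._next_start) + self._min_interval
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self._sem.release()


async def fake_fetch(url):
    """Stand-in for a real HTTP call (assumption: replace with your client)."""
    await asyncio.sleep(0.01)
    return f"body of {url}"


async def crawl(urls, fetch, max_concurrent=2, min_interval=0.05):
    """Fetch all urls through the limiter; results come back in input order."""
    # Created inside the coroutine so it is tied to the running event loop.
    limiter = PoliteLimiter(max_concurrent, min_interval)

    async def fetch_one(url):
        async with limiter:
            return await fetch(url)

    return await asyncio.gather(*(fetch_one(u) for u in urls))
```

You'd run it with something like asyncio.run(crawl(urls, fake_fetch)). With a 0.05 s spacing the crawl is paced by the limiter rather than by how fast the interpreter is, which is the point: once you throttle to be polite, the language's raw speed stops being the limiting factor.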