by gardnr 1665 days ago
Have you thought about pushing the links onto a queue and running multiple scrapers off that queue? You'd need to build in some politeness mechanism to make sure you're not hitting the same domain or IP address too often, but it seems like a better option than a serial process.
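Roughly what I have in mind, as a minimal Python sketch rather than a finished crawler. The names here (`PolitenessGate`, `crawl`, the `fetch` callable, `min_delay`) are all made up for illustration, and it assumes a fixed list of URLs and threads as the "multiple scrapers"; a crawler that pushes newly discovered links back onto the queue would need a different shutdown condition.

```python
import queue
import threading
import time
from urllib.parse import urlparse


class PolitenessGate:
    """Spaces out requests to the same host by at least min_delay seconds."""

    def __init__(self, min_delay):
        self.min_delay = min_delay
        self._next_allowed = {}  # host -> earliest monotonic time for the next request
        self._lock = threading.Lock()

    def wait_turn(self, host):
        # Reserve the next free slot for this host under the lock,
        # then sleep outside the lock so other hosts are not held up.
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed.get(host, now))
            self._next_allowed[host] = slot + self.min_delay
        delay = slot - time.monotonic()
        if delay > 0:
            time.sleep(delay)


def crawl(urls, fetch, num_workers=4, min_delay=1.0):
    """Fetch every URL using num_workers threads pulling from a shared queue.

    fetch is any callable taking a URL and returning a result (hypothetical
    here; plug in whatever HTTP client you already use).
    """
    work = queue.Queue()
    for url in urls:
        work.put(url)

    gate = PolitenessGate(min_delay)
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                url = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            host = urlparse(url).hostname or ""
            gate.wait_turn(host)
            try:
                outcome = fetch(url)
            except Exception as exc:  # keep the worker alive on a bad URL
                outcome = exc
            with results_lock:
                results[url] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

One caveat with this version: a worker that draws a URL for a busy host just sleeps until its slot comes up, so with few workers and many links on one domain you lose some parallelism. Keeping one queue per host would avoid that, at the cost of more bookkeeping. It also keys on hostname only, not on the resolved IP address.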