by weego 5204 days ago
I built a scraper with Node.js and NowJS. The server sent instructions to a JavaScript bot that I injected into the page rendered by PhantomJS; the bot then scraped the page and sent snippets back to the server, again via NowJS. The real win for me was that the target was a Comet + AJAX site, which is usually hard to scrape efficiently, but I just synced the bot with the Comet updates and away we go. Also, because PhantomJS is headless, I could spin up 20 instances on a cheap rack server without any performance problems.
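
To make the message flow concrete, here is a rough sketch in plain Node.js. It is not the original code and it deliberately avoids the PhantomJS and NowJS APIs: a plain `EventEmitter` stands in for the NowJS link, and `FakePage`, `installBot`, `runServer`, and the `price` field are all hypothetical names made up for illustration. It only shows the shape of the idea: the server sends an instruction, the bot waits for each pushed update, and it sends back only the requested snippets.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for the rendered page: it emits 'update'
// whenever the Comet/AJAX feed pushes new content.
class FakePage extends EventEmitter {
  push(fields) {
    this.emit('update', fields);
  }
}

// Hypothetical stand-in for the server <-> bot link (NowJS in the comment).
const channel = new EventEmitter();

// Bot side: in the real setup this would be injected into the page.
function installBot(page, link) {
  let wanted = [];

  // Instructions from the server say which fields to scrape.
  link.on('instruction', (msg) => {
    wanted = msg.fields;
  });

  // Scrape in step with the page's own updates instead of polling.
  page.on('update', (fields) => {
    const snippet = {};
    for (const name of wanted) {
      if (name in fields) snippet[name] = fields[name];
    }
    // Only send something back if the update contained a wanted field.
    if (Object.keys(snippet).length > 0) link.emit('snippet', snippet);
  });
}

// Server side: send one instruction and collect whatever comes back.
function runServer(link, fields) {
  const collected = [];
  link.on('snippet', (snippet) => collected.push(snippet));
  link.emit('instruction', { fields });
  return collected;
}

const page = new FakePage();
installBot(page, channel);
const collected = runServer(channel, ['price']);

// Simulate three pushed updates; the second has nothing of interest.
page.push({ price: '10.50', ad: 'ignored' });
page.push({ ad: 'only noise' });
page.push({ price: '10.75' });

console.log(collected);
```

Because the bot reacts to the update events rather than re-reading the page on a timer, it does no work between pushes, which is presumably part of why many headless instances stayed cheap to run.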