by maratc 4236 days ago
Would it be a good fit if I need to scrape just the home page of a million sites, running on a hundred servers? No analysis of the pages is needed at fetch time; that is done later.

The fetcher already fits your use case...
You are running

   phantomjs phantomjs_fetcher.js
and using it as a proxy? The setup instructions are a bit unclear on this.
I wanted to make it an HTTP proxy in the beginning, but I found that hard to do. So I POST every request to it instead, but I haven't changed the name.

But it works like a proxy: any request with `fetch_type == 'js'` is fetched through phantomjs, and the response is passed back to tornado_fetcher.
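
To make the routing concrete, here is a rough sketch of the idea, not the actual fetcher code. The function name `route_task`, the two callables, and the endpoint address are all made up for illustration; the thread does not say where `phantomjs_fetcher.js` listens or what the task payload looks like beyond `fetch_type`.

```python
import json

# Assumed address of the process started with `phantomjs phantomjs_fetcher.js`.
# The thread does not give the real host or port.
PHANTOMJS_ENDPOINT = "http://localhost:25555"


def route_task(task, http_fetch, post_to_phantomjs):
    """Send a task to phantomjs if it asks for JS rendering, else fetch it directly.

    `http_fetch(url)` and `post_to_phantomjs(endpoint, body)` are hypothetical
    callables supplied by the caller; they stand in for the real HTTP client.
    """
    if task.get("fetch_type") == "js":
        # The whole task is POSTed as JSON; phantomjs loads the page and the
        # response comes back to the caller, much like going through a proxy.
        return post_to_phantomjs(PHANTOMJS_ENDPOINT, json.dumps(task))
    # Anything else is fetched as a plain HTTP request.
    return http_fetch(task["url"])
```

Passing the two fetch functions in as arguments keeps the sketch independent of any particular HTTP library and makes it easy to try with stubs.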