by holoduke 2818 days ago
For many years now, the best way to scrape has been a headless browser, for example PhantomJS driven from Node.js. Combined with Tor or a large proxy pool, no alternative beats it.
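To make the proxy pool part concrete, here is a rough sketch of round-robin proxy rotation. It is only an illustration under assumptions: it uses plain HTTP requests from Python's standard library instead of a headless browser, and the proxy addresses, `ProxyPool`, `build_opener_for`, and `fetch` are hypothetical names made up for this example.

```python
import itertools
import urllib.request

# Hypothetical proxy addresses; substitute proxies you actually control.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]


class ProxyPool:
    """Hands out proxies in round-robin order (illustrative helper)."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy pool must not be empty")
        self._cycle = itertools.cycle(list(proxies))

    def next_proxy(self):
        # Each call returns the next proxy, wrapping around at the end.
        return next(self._cycle)


def build_opener_for(proxy):
    # Route both http and https traffic through the given proxy.
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def fetch(url, pool, timeout=10.0):
    # Use a different proxy from the pool for every request.
    opener = build_opener_for(pool.next_proxy())
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch(url, ProxyPool(PROXIES))` repeatedly would send each request through the next proxy in the list. With a headless browser the same idea applies, except the proxy is normally passed to the browser when it starts.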
2 comments

This is how it works under the hood. But everything is wired for you ;)
I wouldn't scrape over Tor; you would be slowing down the network for people who actually need it. Maybe it's acceptable if you are running a node yourself.