by
holoduke
2818 days ago
For many years now, the best way to scrape has been a headless browser, for example PhantomJS driven from Node.js. Combine that with Tor or a large proxy pool and no other alternative can beat it.
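A minimal sketch of the proxy-pool half of that setup, in TypeScript. It only rotates through a list of proxies and builds the command-line arguments for a headless browser; it does not launch anything. Note the swap: the flags shown (`--headless`, `--proxy-server`) are Chromium's, not PhantomJS's, and the names `makeProxyPool` and `headlessArgs` as well as the `proxy-a.example` / `proxy-b.example` hosts are made up for illustration. The first entry assumes a local Tor client listening on its default SOCKS port, 9050.

```typescript
// Hypothetical helper: returns a function that hands out proxies round-robin.
function makeProxyPool(proxies: string[]): () => string {
  if (proxies.length === 0) {
    throw new Error("proxy pool must not be empty");
  }
  let next = 0;
  return () => {
    const proxy = proxies[next % proxies.length] as string;
    next += 1;
    return proxy;
  };
}

// Hypothetical helper: builds Chromium-style arguments for one headless session.
// Assumption: the browser being launched is Chromium-based.
function headlessArgs(proxy: string): string[] {
  return ["--headless", `--proxy-server=${proxy}`];
}

// Placeholder pool: a local Tor SOCKS endpoint plus two made-up HTTP proxies.
const pickProxy = makeProxyPool([
  "socks5://127.0.0.1:9050",
  "http://proxy-a.example:8080",
  "http://proxy-b.example:8080",
]);

// Each new browser session gets the next proxy in the pool.
for (let i = 0; i < 4; i++) {
  console.log(headlessArgs(pickProxy()).join(" "));
}
```

Round-robin is just the simplest rotation policy; a real pool would probably also drop proxies that start failing or getting blocked.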
2 comments
ziflex
2818 days ago
That is how it works under the hood, but everything is already wired up for you ;)
jotadambalakiri
2818 days ago
I wouldn't scrape over Tor; you would be slowing down the network for people who actually need it. Maybe it's fine if you are running a node yourself.