by rosha 2938 days ago
Scraping any SERP at high volume is a pain if your business relies on it, and most services out there either don't work at high volume or work but are crazy expensive.

I have checked a few solutions out there, and I am now using proxycrawl. The developers of their API helped me get a very high volume of SERP data from different search engines like Yandex, Google, Yahoo and Bing. I also use them for JavaScript crawling, since our project needs lots of content that is rendered via JavaScript. I am amazed at how their API endpoint works: you basically send a URL to their API and you are good to go. Make sure to contact them about some sites, as they do not let you crawl the world by default unless you prove your use case; they liked my product, and that is how it got started. I've had a really successful experience with it, so I totally recommend it. You basically communicate with developers who do lots of work to make it happen. As I mainly work in JS, I asked for a Node.js package and they just built it and open sourced it. https://github.com/proxycrawl/proxycrawl-node
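
To give a rough idea of what "sending a URL to their API" looks like in Node, here is a minimal sketch. The endpoint host, the `token` and `url` parameter names, and the `buildCrawlRequest` helper are all made up for illustration; they are not taken from the proxycrawl docs or from the proxycrawl-node package, so check those for the real names.

```javascript
// Hypothetical endpoint and parameter names, for illustration only.
// The real API may use a different host and different parameters.
const API_ENDPOINT = 'https://api.example.com/';

// Build the request URL for a crawl: the page you want fetched is passed
// as a query parameter, so it has to be percent-encoded.
function buildCrawlRequest(token, targetUrl) {
  const requestUrl = new URL(API_ENDPOINT);
  // searchParams.set() takes care of encoding the nested URL.
  requestUrl.searchParams.set('token', token);
  requestUrl.searchParams.set('url', targetUrl);
  return requestUrl.toString();
}

// Example: ask the (hypothetical) API to fetch a Bing results page.
console.log(buildCrawlRequest('YOUR_TOKEN', 'https://www.bing.com/search?q=test'));
```

The only part that tends to trip people up is the encoding: the target URL has its own `?` and `&`, so it must be escaped before it is appended to the API URL.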

1 comment

It would be interesting to know what technologies they use to scrape at high volume for 0.005 US cent per successful request. I checked this package and it looks decent; I like dependency-free libraries. I'll check their API for Bing. Thanks.