by d4rkp4ttern 1116 days ago
I think I understand pagination — but can you elaborate on proxy rotation?

> combines the practicality of language models with the powerful features of a traditional scraper such as pagination and proxy rotation

3 comments

When scraping websites, it’s often necessary to change your IP address to get around the site’s anti-scraping measures. There are proxy services designed with web scraping in mind, so it’s easy to change your IP address programmatically from within a scraper.
It sends your requests through many different proxies so your scraper does not get rate limited or blocked by IP address.
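To make the idea concrete, here is a rough sketch of round-robin proxy rotation with retry. It is only an illustration, not how any particular tool or proxy service does it: the proxy URLs, `ProxyRotator`, `fetch_with_rotation`, and the caller-supplied `fetch` function are all made-up names for this example.

```python
import itertools
from typing import Callable, Iterable, Optional

# Hypothetical proxy endpoints; a real proxy service would supply its own.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


class ProxyRotator:
    """Hands out proxies round-robin so consecutive requests use different IPs."""

    def __init__(self, proxies: Iterable[str]) -> None:
        proxy_list = list(proxies)
        if not proxy_list:
            raise ValueError("need at least one proxy")
        self._cycle = itertools.cycle(proxy_list)

    def next_proxy(self) -> str:
        return next(self._cycle)


def fetch_with_rotation(
    url: str,
    rotator: ProxyRotator,
    fetch: Callable[[str, str], str],
    max_attempts: int = 3,
) -> str:
    """Try the URL through successive proxies until one attempt succeeds.

    `fetch(url, proxy)` is a caller-supplied function (hypothetical here) that
    performs the actual HTTP request and raises an exception when it fails.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        proxy = rotator.next_proxy()
        try:
            return fetch(url, proxy)
        except Exception as exc:  # blocked, rate limited, or proxy down
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

With the `requests` library, `fetch` could be something like `lambda url, proxy: requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10).text`. Each page of a paginated site then goes out through a different IP, and a blocked proxy just means the next one gets tried.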
You basically switch out the proxy you use to scrape. Services by Google or others can identify scrapers because they'll use the same proxy to request paged