by ComputerGuru 3046 days ago
You also don’t want to get your server blocked by Yelp if they do rate limiting.
2 comments

That's why I'm using proxies: every request is routed through a different proxy address, and the application as a whole is rate limited. So hopefully I'm not generating too much traffic on Yelp. They are just a perfect example because they use all the types of data I'm looking for. When I find more good examples, I will add them and rotate between them on every page load.
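
For anyone curious what "a different proxy per request plus one app-wide rate limit" could look like, here is a rough sketch in Python using only the standard library. It is not the actual code behind the app. The class names, the one-second interval, and the proxy addresses in the usage note are all made up for illustration.

```python
import itertools
import time
import urllib.request
from typing import Callable, Iterable


class RateLimiter:
    """Allow at most one request per `min_interval` seconds, app-wide."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this one was granted.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


class ProxyRotator:
    """Hand out proxies round-robin, one per request, behind one limiter."""

    def __init__(self, proxies: Iterable[str], limiter: RateLimiter) -> None:
        proxy_list = list(proxies)
        if not proxy_list:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxy_list)
        self._limiter = limiter

    def next_proxy(self) -> str:
        self._limiter.wait()     # the limit is shared across all proxies
        return next(self._cycle)

    def fetch(self, url: str, timeout: float = 10.0) -> bytes:
        """Fetch `url` through the next proxy in the rotation."""
        proxy = self.next_proxy()
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        opener = urllib.request.build_opener(handler)
        with opener.open(url, timeout=timeout) as response:
            return response.read()
```

With something like `ProxyRotator(["http://proxy-a.example:8080", "http://proxy-b.example:8080"], RateLimiter(1.0))`, each call to `fetch` would go out through the next proxy in the list, and the whole application would stay at or below one request per second no matter how many proxies are configured.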

Btw, when it comes to ToS and scraping, this is not much different from accessing their website through a normal browser, only instead of rendered content we show you analyzed data. The page is only loaded once, the same as in a browser.

They have fairly aggressive scraper detection (and this is also against their ToS).