Hacker News
by funnyflamigo 1681 days ago
I know some people think all scraping is bad or malicious. I'd like to point out that this is a perfectly legitimate use case for it; in fact, this is how Google Search operates.

Web scraping done correctly should be barely noticeable, if at all, to the site's operators. Don't send 10,000 req/s. Use long delays between requests, wait generously before each retry, and try to avoid pages or actions you know are "heavy". You don't need to update data from every product page every 5 minutes.
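
To make that concrete, here's a rough sketch of what I mean by delays and generous retries, using only the Python standard library. The function names, the 10 second pause, the 5 second backoff base, and the User-Agent string are all made up for illustration; pick numbers that suit the site you're hitting.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delay(attempt, base=5.0, cap=600.0):
    """Seconds to wait before retry number `attempt` (0-based): doubles each time, capped."""
    return min(cap, base * (2 ** attempt))


def _urlopen_fetch(url):
    # Hypothetical User-Agent; identify yourself so operators can contact you.
    req = urllib.request.Request(
        url, headers={"User-Agent": "example-bot/0.1 (contact@example.com)"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def polite_fetch(url, fetch=None, sleep=time.sleep, max_retries=4, base=5.0):
    """Fetch `url`, waiting longer after each failure; return the body, or None if all tries fail."""
    fetch = fetch or _urlopen_fetch
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except (urllib.error.URLError, TimeoutError):
            if attempt == max_retries:
                return None
            # Up to one second of jitter so parallel workers don't retry in lockstep.
            sleep(backoff_delay(attempt, base) + random.uniform(0, 1))
    return None


def crawl(urls, delay=10.0, fetch=None, sleep=time.sleep):
    """Fetch pages one at a time with a fixed pause between them."""
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)
        results[url] = polite_fetch(url, fetch=fetch, sleep=sleep)
    return results
```

The `fetch` and `sleep` parameters are only there so the logic can be exercised without touching the network; in real use you'd leave them at their defaults and call something like `crawl(list_of_urls)`.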

1 comment

My guess is that scraping is getting heavier because scrapers have to use headless browsers now, and so they're probably downloading artifacts they don't need, because they can't tell what's needed and what isn't, at least with JS.
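
For what it's worth, here's a sketch of how a headless scraper could at least skip the obviously unneeded stuff. It assumes the third-party Playwright package for Python with a Chromium build installed. The `SKIPPABLE` set and both function names are my own choices, not anything standard. Scripts are deliberately left alone, since that's exactly the part where you can't tell what's needed.

```python
# Resource types assumed to be unnecessary when only the rendered HTML is wanted.
SKIPPABLE = frozenset({"image", "media", "font"})


def should_block(resource_type, skippable=SKIPPABLE):
    """Return True if a request of this Playwright resource type can be skipped."""
    return resource_type in skippable


def fetch_rendered_html(url):
    """Load `url` in headless Chromium, aborting requests for skippable resources."""
    # Imported here so the rest of the module works without Playwright installed.
    from playwright.sync_api import sync_playwright

    def handle(route):
        if should_block(route.request.resource_type):
            route.abort()       # never downloaded, so no load on the site
        else:
            route.continue_()   # documents, scripts, XHR, etc. go through

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.route("**/*", handle)
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

This only trims the easy cases. Any script the page pulls in still gets fetched and run.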