100% agree. Scraping should always be done respectfully:

- If they provide an API, use it.
- Don't slam a website. Ideally, spread requests out over the hours of the day when their target audience is least active (night time).
- If you can get cached data from somewhere that works, use that.

Most developers are respectful and only scrape what they really need, not only from an ethical point of view but also from a cost and resources point of view. Scraping data is resource intensive, and proxy costs can quickly rise to $1,000-$10,000 per month, so most only scrape the minimum they need.

The other thing here is that a lot of the most popular sites being scraped are also massive scrapers themselves. The big ecommerce sites are being scraped, but they are scraping their competitors too.
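
To make the "don't slam a website" and "use cached data" points concrete, here is a rough sketch of what a polite fetcher could look like in Python, using only the standard library. Everything named here (`PoliteFetcher`, `http_get`, `is_off_peak`, the 5-second delay, the 01:00-05:00 quiet window, the User-Agent string) is made up for illustration, so tune it to whatever the target site can reasonably handle.

```python
import time
import urllib.request


def http_get(url):
    """Plain GET with an identifying User-Agent (placeholder value, replace with your own)."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "example-bot/0.1 (contact: you@example.com)"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def is_off_peak(hour, start=1, end=5):
    """True if `hour` (0-23, site-local time) is inside the assumed quiet window [start, end)."""
    return start <= hour < end


class PoliteFetcher:
    """Fetch URLs with a minimum gap between requests and an in-memory cache."""

    def __init__(self, min_delay=5.0, fetch=http_get, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay  # seconds between real requests (assumed value)
        self.fetch = fetch          # injectable so it can be swapped out or tested offline
        self.clock = clock
        self.sleep = sleep
        self.cache = {}             # url -> body; never re-request what we already have
        self.last_request = None

    def get(self, url):
        # Serve from cache when possible: no network traffic at all.
        if url in self.cache:
            return self.cache[url]
        # Otherwise wait until at least `min_delay` seconds have passed since the last request.
        if self.last_request is not None:
            wait = self.min_delay - (self.clock() - self.last_request)
            if wait > 0:
                self.sleep(wait)
        body = self.fetch(url)
        self.last_request = self.clock()
        self.cache[url] = body
        return body
```

A scheduler could check `is_off_peak(current_hour)` before starting a batch and then call `PoliteFetcher().get(url)` for each page. The cache here only lives in memory; a real job would probably persist it to disk so that reruns don't hit the site again.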