by jswanson 4935 days ago
Search engines:

- Respect robots.txt (as mentioned elsewhere), which often restricts them to a limited subset of all the data available

- Give something in return (potential traffic) for the data they reap.
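For anyone wanting to follow the robots.txt point in their own scripts, here is a rough sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, the example.com URLs, and the `make_checker` helper are all made up for illustration; a real scraper would fetch the site's actual robots.txt rather than parse an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without a network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def make_checker(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text (illustrative helper, not a library API)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


checker = make_checker(ROBOTS_TXT)

# Ask before fetching: is this user agent allowed to request this URL?
allowed = checker.can_fetch("example-bot", "https://example.com/articles/1")
blocked = checker.can_fetch("example-bot", "https://example.com/private/data")

# Crawl-delay is a non-standard directive, but honoring it when present is polite.
delay = checker.crawl_delay("example-bot")

print(allowed, blocked, delay)
```

Against a live site you would probably call `set_url(...)` and `read()` on the parser instead of `parse(...)`, then sleep for the crawl delay between requests.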

I fully agree that scraping is great, and I do it myself frequently. Site operators do have legitimate concerns in some situations, though, and those probably come from feeling as if they are being 'ripped off' somehow.

No one in their right mind is going to object to incidental scraping for personal use.

However, scraping is often scripted into cron or the like, and the data is then used to make a profit for someone else. I'm usually cool with that, but if someone is running a web site and depends on ad revenue to keep the servers running, I understand them objecting to it.

1 comment

Good rules of thumb.

> No one in their right mind is going to object to incidental scraping for personal use.

It would almost certainly involve stripping ads when re-purposing the content.