by jumperjake 3896 days ago
You're confusing redistribution with scraping. Scraping publicly accessible content is legal. Redistributing it is not (at least not in the US).

Google scrapes webpages as a core competency.


Google also adheres to the robots.txt standard. Most of the scrapers I block don't.
Not correct. Google will completely ignore the rules in robots.txt if it deems that acceptable. I think there's a link about this somewhere on this comment page.
They do not index the content but might add the URL, correct. If a meta noindex tag is present, they won't index even the URL.
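
For anyone who wants to see the two mechanisms side by side, here is a rough sketch of how a well-behaved crawler might check both. It uses only the Python standard library. The robots.txt body, the "ExampleBot" user agent, the example.com URLs, and the helper names are all made up for illustration, and this says nothing about how Google actually implements it.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _NoindexFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains "noindex"."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        content = (attr_map.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_meta_noindex(html: str) -> bool:
    """Return True if the page asks crawlers not to index it."""
    finder = _NoindexFinder()
    finder.feed(html)
    return finder.noindex
```

A polite scraper would call allowed_by_robots() before fetching a page and has_meta_noindex() on the fetched HTML before storing anything. The scrapers complained about above simply skip both checks.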