by jebblue 4655 days ago
>> The irony of this has always intrigued me: Google may crawl your servers, but under Google's policies, you may not crawl Google's servers.

It looks like some of their site can be crawled and some cannot; that's how robots.txt has worked for a long time:

http://www.google.com/robots.txt
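
For anyone who wants to check a given path themselves, here is a rough sketch using Python's standard urllib.robotparser. The rules in SAMPLE_RULES are an illustrative assumption, not a copy of the live file, and can_crawl is just a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only (an assumption); not copied from the live robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /search
Allow: /
"""


def can_crawl(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Search result pages fall under the Disallow rule; other paths do not.
print(can_crawl(SAMPLE_RULES, "http://www.google.com/search?q=test"))
print(can_crawl(SAMPLE_RULES, "http://www.google.com/about"))
```

To test against the real file instead, the parser's set_url() and read() methods fetch it over the network. Note that robots.txt is advisory: it tells well-behaved crawlers what the site owner wants, but it does not enforce anything.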


And search results (the data they have obtained by crawling others' sites) are not among the data that can be crawled.

What are you suggesting?