by las3rjock 5611 days ago
Crawling Google URLs of the form google.com?q=x would disregard http://www.google.com/robots.txt, which seems like bad netiquette to me.
1 comment

They aren't crawling, just noticing which pages clients who visit google.com?q=xxx go to next.

If anybody's search toolbar checks a site's robots.txt before sending clickstream data, I would be very surprised.

A client-side robots.txt rule would also make anti-phishing features trivial to bypass: just put a robots.txt on your phishing site.
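
To make the bypass concrete, here is a rough sketch using Python's standard urllib.robotparser. Everything specific in it is an assumption for illustration: the ExampleToolbar agent name, the example.com and phishing.example hosts, and both robots.txt bodies are made up, and no real toolbar is known to work this way.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, agent: str = "ExampleToolbar") -> bool:
    """Return True if the given robots.txt text permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule that blocks a search-results path for every agent.
SEARCH_ROBOTS = "User-agent: *\nDisallow: /search\n"

# Hypothetical robots.txt a phishing site could serve to opt out of everything.
PHISHING_ROBOTS = "User-agent: *\nDisallow: /\n"

# A well-behaved crawler would skip the search-results URL.
crawler_ok = allowed_by_robots(SEARCH_ROBOTS, "http://www.example.com/search?q=x")

# A client that honored robots.txt before checking a page would skip this one too.
phishing_checked = allowed_by_robots(PHISHING_ROBOTS, "http://phishing.example/login")
```

Both results come out False. The first is the polite-crawler case; the second is the problem: a two-line robots.txt is enough for the phishing page to opt out of a client-side check that defers to it.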