by taptaptapimin 2472 days ago
This is true, but only technically. Google won't actively scrape anything disallowed in robots.txt, but those resources can still be indexed if Google finds them through any of the many other ways it aggregates data, all of which are automated.

Robots.txt isn't something that bars access to information. It's just a notice that the administrator does not want large amounts of queries against certain resources.

1 comment

>Robots.txt isn't something that bars access to information. It's just a notice that the administrator does not want large amounts of queries against certain resources.

Robots.txt files are often implemented with the intent of barring access to information.

This works only because scrapers choose to respect the file. It's no different from a no-loitering sign, which cannot itself stop someone from loitering.
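
To make that concrete, here's a rough Python sketch of what "respecting the file" looks like from the scraper's side, using the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the example.com URLs, and the `may_fetch` helper are all made up for illustration. The point is that the check runs inside the client, so nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is purely advisory: the scraper has to choose to call it
    and then choose to obey the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite scraper asks before fetching; an impolite one just skips this step.
allowed = may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/private/report.html")
public = may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html")
```

A scraper that never calls anything like `may_fetch` can still request `/private/report.html` and will get whatever the server would normally return.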

Google's own Robots.txt doesn't disallow search because Google can't handle a large number of queries against that resource...