by chc
2546 days ago
This thread is about what behavior we should design crawlers to have. One person said crawlers should disregard noindex directives on government sites, and you replied that they should ignore all robots.txt directives and just crawl whatever they can. If you intentionally ignore robots.txt, that is intent, by definition.
In my younger years the only time I ever dealt with robots.txt was to find stuff I wasn't supposed to crawl.