by JoshTriplett
3839 days ago
I think the archive.org crawler should respect robots.txt as it looked at the time of the crawl. As a well-behaved robot, archive.org's crawler should fetch and respect robots.txt each time it crawls. However, archive.org should not retroactively delete old content when the current site puts up a robots.txt. (To answer your other question, the robots.txt standard already allows giving different instructions to different crawlers.)
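
To illustrate the per-crawler point, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the "OtherBot" name are made up for illustration, and I'm assuming "ia_archiver" as the user-agent token for the archive.org crawler; check what the crawler actually sends before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for the archive crawler, one for everyone else.
# "ia_archiver" is an assumed user-agent token, not taken from the comment above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The archive crawler's own group applies to it; other bots fall back to "*".
    print(allowed("ia_archiver", "https://example.com/page"))
    print(allowed("ia_archiver", "https://example.com/private/notes"))
    print(allowed("OtherBot", "https://example.com/page"))
```

A crawler that wants to honor robots.txt "as it looked at the time of the crawl" would run a check like this against the copy it fetched during that crawl, not against whatever the site serves later.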