by Spare_account
3351 days ago
Here's a slight modification to the GP proposal:

- Respect robots.txt at the time you crawl it.
- If robots.txt appears later, stop archiving from that date forwards.
- Preserve access to old archived copies of the site by default.
- Offer a mechanism that allows a proven site owner to explicitly request retrospective access removal.

If archive.org have recorded the date that they first observed a robots.txt on the sites currently unavailable, they could even consider applying the above logic today retrospectively. Perhaps after a couple of warning emails to the current Administrative Contact for the domain.
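
To make the rules above concrete, here is a rough Python sketch of how the decision logic might look. Everything in it is hypothetical: `SitePolicy`, `may_crawl` and `may_serve` are names made up for illustration, not anything archive.org is known to use, and it assumes the archive records a single "first observed robots.txt" date per site and that a robots.txt, once seen, excludes the whole site.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SitePolicy:
    # Hypothetical record: date the crawler first observed an excluding
    # robots.txt on the site, or None if it has never seen one.
    robots_first_seen: Optional[date] = None
    # Set only after a proven owner explicitly requests retrospective removal.
    owner_requested_removal: bool = False


def may_crawl(policy: SitePolicy, today: date) -> bool:
    """Keep archiving only until a robots.txt exclusion has been observed."""
    return policy.robots_first_seen is None or today < policy.robots_first_seen


def may_serve(policy: SitePolicy, snapshot_date: date) -> bool:
    """Old snapshots stay visible by default; an explicit owner request hides them."""
    if policy.owner_requested_removal:
        return False
    if policy.robots_first_seen is None:
        return True
    # Snapshots taken before the robots.txt appeared remain accessible.
    return snapshot_date < policy.robots_first_seen


policy = SitePolicy(robots_first_seen=date(2015, 6, 1))
print(may_serve(policy, date(2014, 1, 1)))  # snapshot predates robots.txt
print(may_crawl(policy, date(2016, 1, 1)))  # robots.txt already observed
```

The sketch deliberately ignores cases the proposal doesn't cover, such as a robots.txt that is later removed or one that only excludes part of a site.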
|
It should be "a proven content owner"; just buying a site shouldn't allow someone to remove it from the archive.