by LocalH 38 days ago
I'd rather they disregard robots.txt than have the opposite situation: someone runs a domain with no robots.txt, so IA is free to archive it; the domain later lapses and gets swooped up by a parker, who adds a robots.txt blocking IA from the whole site. That would then have caused IA to remove all historical archives of that domain from public view.
1 comment

Hiding old archives when robots.txt changed was a problem the Internet Archive created and could have fixed at any time.