Hacker News
by drcube 4924 days ago
>The archive.org team does follow robots.txt, and I believe they remove content retroactively, meaning that if you add a robots.txt to your site, they will delete the old content (which I think sucks).

Every time the "Change Facebook back to the way it was!" brigade came out, I would link to the Wayback Machine's copy of facebook.com from 2005 and say "Is this what you want??". Now I can't do that anymore because of stupid robots.txt.

1 comment

I hope they have a backup of this old content. This robots.txt policy is crap. robots.txt should not be applied retroactively, especially when the site's owner has changed since the content was archived.