by oefrha 778 days ago
It doesn’t. As the domain owner, you can block their crawler with robots.txt and send DMCA takedown requests for archived pages, which they will honor.

Edit: I was under the mistaken impression that they would honor robots.txt if you specifically call out ia_archiver in it. It’s been completely ignored since 2017.
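
For reference, here is a rough sketch of what "calling out ia_archiver" in robots.txt looks like, checked with Python's standard urllib.robotparser. The robots.txt content, the example.com URL, and the is_allowed helper are all made up for illustration. The parser only reports what the file asks for; whether a crawler actually obeys it is entirely up to the crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that singles out ia_archiver.
# Illustrative content only, not taken from any real site.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return what the robots.txt rules request for this user agent and URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The rules ask ia_archiver to stay out, and let everyone else in.
    print(is_allowed("ia_archiver", "https://example.com/page"))   # False
    print(is_allowed("SomeOtherBot", "https://example.com/page"))  # True
```

Nothing here stops a request from being made: a crawler that never consults these rules fetches the page anyway.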

3 comments

If I remember correctly, they changed their policy quite some time ago and started to ignore robots.txt, but I'm not 100% sure about it.
robots.txt is a suggestion, not a block.
Their crawler now ignores robots.txt.