Hacker News
by
anon4
3839 days ago
robots.txt already lets you specify per-robot behaviour. You can trivially opt out of crawling but opt in to archiving by explicitly allowing archive.org's bot and disallowing all other user agents.
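A rough sketch of what that could look like, checked with Python's standard urllib.robotparser. The user-agent token ia_archiver is an assumption here (it is the name historically associated with archive.org's crawler, so verify the current token before relying on it), and example.com and SomeOtherBot are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow one archiving bot, disallow everyone else.
# "ia_archiver" is an assumed token for archive.org's crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some/page"
    print("ia_archiver:", is_allowed("ia_archiver", page))
    print("SomeOtherBot:", is_allowed("SomeOtherBot", page))
```

This only tells you what a well-behaved crawler should do with those rules; nothing in robots.txt enforces them.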