by slang800 2536 days ago
On a similar note, tools like [grab-site](https://github.com/ArchiveTeam/grab-site) wisely use robots.txt to find additional paths to archive when crawling a site.
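
To make the idea concrete, here is a minimal sketch of how a crawler could mine robots.txt for extra seed URLs. It is not taken from the linked tool; the function name `paths_from_robots` and the choice to treat `Allow`, `Disallow`, and `Sitemap` values as crawl candidates are assumptions made purely for illustration.

```python
from urllib.parse import urljoin


def paths_from_robots(robots_txt: str, base_url: str) -> list[str]:
    """Collect candidate URLs named in a robots.txt body (hypothetical helper)."""
    found: list[str] = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if not value:
            continue  # e.g. a bare "Disallow:" names no path
        if field in ("allow", "disallow"):
            # Keep only the literal prefix before any wildcard or end anchor.
            prefix = value.split("*", 1)[0].rstrip("$")
            if not prefix or prefix == "/":
                continue  # the site root is already the crawl's starting point
            url = urljoin(base_url, prefix)
        elif field == "sitemap":
            url = value  # sitemap entries are given as full URLs
        else:
            continue
        if url not in found:
            found.append(url)
    return found
```

Feeding the result into the crawl queue alongside the start URL would surface directories such as `/admin/` or `/private/` that nothing on the site links to, which is exactly why robots.txt is useful to an archiver.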