by pdfcollect 4582 days ago
Is there a way to replace this robots.txt with a null robots.txt? :)
1 comment

You just ignore the robots.txt file and crawl slowly from distributed virtual machines.

Not that you should do that. Robots.txt is a nicety, though: the client doesn't have to respect it, and the server doesn't have to allow your HTTP requests.
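
For what it's worth, here is a rough sketch of what "respecting it" looks like on the client side, assuming Python's standard urllib.robotparser module. The robots.txt contents, the "examplebot" user agent, and the example.com URLs are all made up for illustration. The point is that the check runs entirely inside the crawler, so nothing but the crawler's own code enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that the client has already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Ask the parsed rules whether this user agent may fetch this URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# "examplebot" and the URLs below are assumptions, not from the thread.
allowed_public = is_allowed(parser, "examplebot", "https://example.com/index.html")
allowed_private = is_allowed(parser, "examplebot", "https://example.com/private/report.pdf")

# Crawl-delay is a common extension; it is only a hint the client may honor.
delay = parser.crawl_delay("examplebot")

print(allowed_public, allowed_private, delay)
```

A polite crawler would skip any URL where is_allowed returns False and wait at least the suggested delay between requests. The server sees none of this; it can only choose whether to answer the requests it actually receives.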