by GuB-42 2529 days ago
robots.txt is supposed to be helpful to the robot too.

If you write a crawler, you probably don't want it to waste time indexing a list of articles in every possible sort order, trying all "reply" buttons, things like that.

For me, a "Disallow" line in robots.txt means "don't bother, nothing interesting here". It is a suggestion that benefits everyone when followed, not an access control list.
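To make that concrete, here is a minimal sketch of a crawler using robots.txt as a "skip list" rather than as a gatekeeper. It uses Python's standard `urllib.robotparser`; the robots.txt content, the `example.com` URLs, the `example-bot` agent name, and the helper `worth_crawling` are all made up for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reply
Disallow: /articles/sorted/
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def worth_crawling(parser: RobotFileParser, agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs the site has not flagged as uninteresting."""
    return [url for url in urls if parser.can_fetch(agent, url)]


parser = make_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/articles/42",              # a real article
    "https://example.com/articles/sorted/by-date",  # same list, another sort order
    "https://example.com/reply?id=42",              # a "reply" button
]
kept = worth_crawling(parser, "example-bot", candidates)
print(kept)
```

Nothing here stops the crawler from fetching the other two URLs anyway; the filter only saves it the trouble, which is the point being made.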

1 comment

>If you write a crawler, you probably don't want it to waste time indexing a list of articles in every possible sort order, trying all "reply" buttons, things like that.

On the other hand, many websites (like Wikipedia here) hide interesting pages behind a Disallow.