robots.txt does not "prohibit" anything. For some reason people have a misconception that robots.txt is used to block bots.

robots.txt is used to HELP bots. It tells bots which pages to visit and which pages are not intended for consumption. If a bot goes ahead and scrapes everything anyway, that's entirely its own prerogative. Particularly for less sophisticated bots without a lot of storage, a good robots.txt can help them avoid getting stuck on dynamically generated content or "useless for indexing" content.
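To make the "it's advisory" point concrete, here is a rough sketch of what a well-behaved bot might do, using Python's standard `urllib.robotparser`. The robots.txt contents, the `examplebot` user agent, the example.com URLs, and the `allowed_urls` helper are all made up for illustration. Note that the check happens entirely inside the bot: nothing on the server side enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site flags its search results and an
# endless generated calendar as not worth crawling.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /calendar/
"""

# Hypothetical URLs the bot has discovered and might visit next.
CANDIDATES = [
    "https://example.com/about",
    "https://example.com/search?q=foo",
    "https://example.com/calendar/2031/01",
]


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots.txt suggests this user agent may fetch."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# The bot filters its own queue; skipping this step is equally possible.
polite = allowed_urls(ROBOTS_TXT, "examplebot", CANDIDATES)
print(polite)
```

A bot that never calls `can_fetch` would simply request all three URLs, which is the whole point: the file is a hint the crawler chooses to follow, and following it mostly saves the crawler from wasting time on the search and calendar pages.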