by curio 5839 days ago
Yes, using the User-agent field in robots.txt:

http://en.wikipedia.org/wiki/Robots_exclusion_standard

1 comments

How difficult is it for a bot to lie?

Though if you've made it clear that only x, y and z can crawl your site, and someone spoofs, say, y, then it would be easy to demonstrate that someone has done something they know they shouldn't.
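
To make the "only x, y and z" idea concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler names `bot-x`, `bot-y` and `bot-z`, the `may_crawl` helper and the example.com URL are all made up for illustration. Note that the check only ever sees the name the client claims, so a scraper calling itself `bot-y` gets the same answer as the real one.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only bot-x, bot-y and bot-z may crawl.
ROBOTS_TXT = """\
User-agent: bot-x
User-agent: bot-y
User-agent: bot-z
Allow: /

User-agent: *
Disallow: /
"""


def may_crawl(user_agent: str, url: str) -> bool:
    """Return what robots.txt says for the name the client claims."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/articles/1"  # placeholder URL
print(may_crawl("bot-y", URL))         # a listed crawler (or anyone claiming to be one)
print(may_crawl("some-scraper", URL))  # an unlisted crawler
```

This prints `True` and then `False`. Any mismatch between the claimed name and, say, the requesting IP range would have to be caught on the server side; robots.txt itself has no way to verify identity.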

Incredibly easy.

Not only can the bot lie, it can disregard the robots.txt file altogether. As with a terms of service document for humans, you can choose to ignore it and deal with the consequences (blocked IPs, a lawsuit, etc.).

robots.txt is just a version of the TOS that computers can read.
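
To show how little effort lying takes, here is a rough sketch with Python's standard `urllib.request`. The `build_request` helper, the `bot-y/1.0` name and the URL are hypothetical. The User-Agent header is just a string the client picks, and nothing in this code ever looks at robots.txt. No request is actually sent here.

```python
from urllib.request import Request


def build_request(url: str, claimed_agent: str) -> Request:
    """Build an HTTP request that announces whatever name the caller picks."""
    # The User-Agent header is self-declared; the server cannot verify it
    # from the header alone.
    return Request(url, headers={"User-Agent": claimed_agent})


# A scraper claiming to be the hypothetical allowed crawler "bot-y".
req = build_request("https://example.com/articles/1", "bot-y/1.0")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would send it as-is. Whether it succeeds depends only on what the server does about it (IP blocks, rate limits), not on robots.txt.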