by hartard
4531 days ago
I love the execution, but I also see inherent problems. Robots.txt is only a convention for advising crawlers; it grants no permission. I'm confident most sites' terms of service explicitly prohibit this kind of automated access. You will encounter terms along the lines of: "Unauthorized uses of the Site also include, without limitation, those listed below. You agree not to do any of the following, unless otherwise previously authorized by us in writing:
Use any robot, spider, scraper, other automatic device, or manual process to monitor, copy, or keep a database copy of the content or any portion of the Site."