by eli 5072 days ago
I thought it was weird that CL didn't add a Disallow line for padmapper to robots.txt from the start (just from a PR perspective).

But robots.txt has no special legal authority; it's just a convention used to communicate a publisher's intent. I'm pretty sure the C&D letter made it 100% clear that CL did not want Padmapper crawling their site or using their data.

2 comments

Padmapper doesn't crawl Craigslist. That's not how it happens.
...any more. Now they are using a third party, but at the time Craigslist sent the C&D they were scraping the site directly.
I know, but I thought the existence of robots.txt was why Google is allowed to crawl sites. If a site disagrees with the crawling they can add a robots.txt entry and Google will honor it. It at least shows that you are giving the publisher an option.
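
For anyone curious what that opt-out looks like mechanically, here is a rough sketch using Python's standard `urllib.robotparser`. The user-agent token `padmapper`, the robots.txt contents, and the example URL are all assumptions made up for illustration; nothing here reflects what Craigslist or Padmapper actually served or sent. It only shows how a well-behaved crawler would read a Disallow entry, not anything about legal weight.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# The "padmapper" token is an assumed user-agent name, not a confirmed one.
ROBOTS_TXT = """\
User-agent: padmapper
Disallow: /

User-agent: *
Disallow:
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/apartments/listing.html"  # placeholder URL
    print("padmapper allowed:", may_fetch("padmapper", url))
    print("Googlebot allowed:", may_fetch("Googlebot", url))
```

A polite crawler would run a check like this before each request and skip anything disallowed. Honoring the result is voluntary, which fits the point above that robots.txt is a convention for signaling intent.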