by berendhh 4896 days ago
One question: which robots.txt do you have to obey, http://app.imdb.com/robots.txt or http://www.imdb.com/robots.txt?

The second one is rather restrictive (and the first one does not exist...)

1 comment

Google's crawlers, at least, treat a robots.txt as valid only for the subdomain it was served from, so the rules at www.imdb.com would not carry over to app.imdb.com.[1]

[1]: https://developers.google.com/webmasters/control-crawl-index...
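To make the per-host rule concrete, here is a rough sketch of how a crawler could work out which robots.txt governs a given URL. The function name robots_url_for and the example path /some/page are my own placeholders, not anything from Google's docs or IMDb, and this only computes the location; it does not fetch or parse the file.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper: it keeps the scheme and host (including any port)
    and replaces the path, query, and fragment with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The two hosts from the question map to two different files.
# "/some/page" is an invented path used only for illustration.
print(robots_url_for("http://app.imdb.com/some/page?x=1"))
print(robots_url_for("http://www.imdb.com/some/page"))
```

Under that reading, a missing robots.txt on app.imdb.com is a separate matter from whatever www.imdb.com disallows.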