by AznHisoka 3587 days ago
https://www.linkedin.com/robots.txt

https://yelp.com/robots.txt

There goes all the LinkedIn and Yelp content from your index.

2 comments

What about https://www.facebook.com/robots.txt

..and medium-sized/small sites are even worse.

The irony: Facebook is a core part of all NSA surveillance programs, yet its terms of service include "Automated Data Collection Terms": https://www.facebook.com/apps/site_scraping_tos_terms.php

If you surf LinkedIn logged out, you'll see that there isn't very much information available anyway. And there's no money in people search.

Yelp was very responsive when blekko wrote to them; as you can see, ScoutJet has the same access as Googlebot.
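
For anyone who wants to check per-crawler access like this themselves, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules below are a made-up sample for illustration, not the actual contents of Yelp's or anyone else's robots.txt, and `allowed` and the example.com URLs are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT any real site's robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: ScoutJet
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/biz/some-page"
    for agent in ("Googlebot", "ScoutJet", "SomeOtherBot"):
        print(agent, allowed(SAMPLE_ROBOTS_TXT, agent, page))
```

To test a live site instead, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file over the network before you call `can_fetch()`.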