Hacker News
by dunno7456 711 days ago
"to tell crawler to not crawl" which can be ignored AFAIK

It can be ignored (it's the equivalent of a "keep out" sign on a lawn), but I very much doubt Google et al. (Edit: Oops, Bing et al.) will actually ignore it.
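To make the "keep out sign" point concrete, here is a minimal sketch of what honoring robots.txt looks like from the crawler's side, using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user agent, the example.com URLs, and the `is_allowed` helper are all made up for illustration. The check is something the crawler chooses to run before fetching; nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of fetching them over the network.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler calls this before each request; a rude one just skips it.
print(is_allowed("ExampleBot", "https://example.com/private/page"))
print(is_allowed("ExampleBot", "https://example.com/public/page"))
```

A crawler that never calls anything like `is_allowed` can still request `/private/page`, which is why robots.txt is a convention rather than access control.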
The article says Google is paying Reddit to get the data directly from their firehose API, so they won't even bother crawling the public website.
I wonder how much they pay. Reddit profits a lot from showing up at the top for many search queries. I very often search for "whatever I'm looking for reddit" (e.g. for product reviews), since the Reddit results often provide higher-quality information than normal results.
I wonder if these indexing deals will become more antitrust evidence.
Google sometimes ignores it when it makes sense (i.e. a big bank accidentally adds its login page to the ignore list) or to check for spam activity (in which case Google doesn't use their bot user agent).