by chalst 4975 days ago
You can do access control on the contents of HTTP_REFERER: if the browser arrives at a page disallowed in your robots.txt by following a Google link, serve it a 403 Forbidden. (In Apache 2.4, this can all be done using mod_authz_core.)

You could maybe say in your 403 Forbidden message that Google has been forbidden from indexing the page (use ErrorDocument). If enough sites did that, Google might change its policy.
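
A rough sketch of what that could look like in an Apache 2.4 config, assuming the disallowed pages live under a hypothetical /private/ path and that matching "google." in the Referer host is close enough (both are illustrative assumptions, not something tested against real Google referrers):

```apache
# Hypothetical path: whatever robots.txt disallows, e.g. "Disallow: /private/"
<Location "/private/">
    # mod_authz_core: allow the request unless the Referer host looks like Google.
    # The pattern is an assumption; it also matches any host starting with "google.".
    # Requests with no Referer at all are allowed through.
    Require expr "!(%{HTTP_REFERER} =~ m#^https?://([^/]*\.)?google\.#)"

    # Custom text shown with the 403 response.
    ErrorDocument 403 "This page is excluded from indexing in robots.txt and is not served to visitors arriving from Google."
</Location>
```

The check only fires when the browser actually sends a Referer header, so it does nothing for visitors whose Referer is missing or rewritten.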

1 comment

Google's default for logged-in users is to use HTTPS and strip the searched phrases when leaving the SERP, so HTTP_REFERER will be empty. A lot of security software also strips HTTP_REFERER. Being behind a proxy may cause it to be empty, too. In general, I don't think you can rely on headers sent by the browser: you don't know whether they are real or forged.