Hacker News
by floatingatoll 2472 days ago
If you want to make a website accessible to any human but essentially invisible to search engines, configure HTTP basic authentication to require credentials but accept any username and/or password, and then set the auth description (the realm) to “Enter anything to continue”.

Human beings will enter anything and read your article, with the browser caching the credentials for a while, but the vast majority of search engines will treat the 401 as “indexing not permitted under law”, to the point that your site might not be returned as a result at all.
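
For anyone who wants to try this, here's a rough sketch of the setup using only Python's standard library. The handler name, port, and page body are placeholders I made up, and a real site would more likely do this in its web server config. It answers 401 with a `WWW-Authenticate` header when no credentials are sent, and serves the page when any Basic credentials are present. Note that not every browser displays the realm text in its prompt.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Realm text from the comment above; some browsers show it in the login prompt.
REALM = "Enter anything to continue"


def is_authorized(auth_header):
    """Accept any Basic credentials at all; reject only a missing or non-Basic header."""
    return bool(auth_header) and auth_header.lower().startswith("basic ")


class AnyCredentialsHandler(BaseHTTPRequestHandler):  # hypothetical name
    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            # No credentials: this is the 401 a crawler would see.
            self.send_response(401)
            self.send_header("WWW-Authenticate", f'Basic realm="{REALM}"')
            self.end_headers()
            return
        # Any username/password gets through; the values are never checked.
        body = b"<p>The article goes here.</p>"  # placeholder content
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8000 on localhost is an arbitrary choice for local testing.
    HTTPServer(("127.0.0.1", 8000), AnyCredentialsHandler).serve_forever()
```

Running it and visiting http://127.0.0.1:8000/ should bring up the browser's login prompt; typing anything into it loads the page.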

Robots.txt doesn’t have the same effect: it is soundly ignored by many malicious/uncaring web spiders and tools, whereas a 401 can’t simply be ignored, because the server never sends the page.

It seems like the smallest thing in the world, but it’s why forums that require you to log in to search are so safe against harassment - as long as they block web spidering!