Hacker News
by eddflrs 2894 days ago
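Adding <meta name="robots" content="noindex" /> to each page should work. Also, as a heads up, a Disallow entry in robots.txt is not enough on its own: it only tells crawlers not to fetch the page, so the URL can still be indexed if it is linked from anywhere else on the web. Crawlers also have to be able to fetch a page to see its noindex tag, so don't block that same page in robots.txt.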
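To make the meta tag suggestion concrete, here is a minimal sketch for checking that a page really carries the noindex directive before relying on it. The names RobotsMetaParser and has_noindex are made up for this example, and it only looks at the HTML you hand it (it does not fetch anything or check the X-Robots-Tag HTTP header).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content may hold several comma-separated directives.
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Something like has_noindex(page_source) could run over each generated page in a build step to catch pages that were missed.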
1 comment

I thought robots.txt was meant to be pulled from the domain and honoured anyway. At least that’s what used to happen. Just because someone links to you doesn’t mean the spiders should crawl all the content.
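For reference, this is roughly what "honouring" robots.txt looks like from the crawler's side, sketched with Python's standard urllib.robotparser. The rules, the ExampleBot user agent and the example.com URLs are made up for illustration. Note that the check only answers "may I fetch this URL?"; it says nothing about whether a search engine may list a URL it learned about from a link elsewhere.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so the sketch needs no network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before every fetch.
may_fetch_private = rules.can_fetch(
    "ExampleBot", "https://example.com/private/report.html"
)
may_fetch_public = rules.can_fetch("ExampleBot", "https://example.com/index.html")
```

With these rules the private URL is refused and the public one is allowed, which covers crawling only.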