HN Mirror



	by enan 3976 days ago
	Thanks for the comments. Our crawler does respect the robots.txt standard and the nofollow tag. It seems noarchive is what Google recommends, so we will look into it further. We do put a banner on the index page, but we don't have one on each mirrored page. Thanks for pointing that out, we will fix it!
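	For illustration, here is a rough sketch of how a crawler might honor a noarchive directive before mirroring a page. It is not the mirror's actual code. The names `RobotsMetaParser` and `may_archive` are hypothetical, and it only looks at `<meta name="robots">` and `<meta name="googlebot">` tags in the HTML, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() not in ("robots", "googlebot"):
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def may_archive(html: str) -> bool:
    """Return False when the page asks not to be archived (noarchive)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives
```

	A crawler would call `may_archive(page_html)` after fetching a page and skip storing a copy when it returns False.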

1 comment

Paulods 3976 days ago

Even more important than that for me (and possibly for you too) is making sure that none of these pages make it into Google's index.

The duplication of content (potentially sending the original pages down in search ranks) and the fact that you are polluting the organic search results for the sites you mirror could be a big issue for the owners of the pages.


enan 3976 days ago

Good point! There is a robots.txt that prevents the site from getting indexed now: http://hn.getpageback.com/robots.txt
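As a sketch of what such a rule does, the snippet below feeds a blanket "disallow everything" robots.txt to Python's standard `urllib.robotparser` and asks whether a crawler may fetch a URL. The `ROBOTS_TXT` contents are an assumption for illustration, not a copy of the file at the URL above, and `is_crawl_allowed` and the `/item` path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed example contents: block every crawler from every path.
# This is NOT a copy of the real file served by the mirror.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical page path on the mirror, used only as a test input.
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "http://hn.getpageback.com/item"))
```

Note that a Disallow rule stops compliant crawlers from fetching pages. A search engine can still list a blocked URL it finds through links elsewhere, so a noindex directive on the pages themselves may also be worth considering.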
