Hacker News
by throwaway2048 2392 days ago
If Google et al. detect too much cloned content on a domain, it's essentially rank-banned forever, for any pages whatsoever.

Having a few different pages isn't going to help.

2 comments

Banned until a manual override, so banned until you matter enough to either the community at large or the tech community.
See https://marc.info, it's by far the best mailing list archive and popular in open source communities, but it absolutely never appears on Google because it has the same content as massively SEO'd crap mailing list archives like Nabble. Google has definitely manually unbanned it a few times, but the override seems to expire after a while.
Yes, but AFAIK none of these archives has any connection to the primary source, so while we might all prefer no ads, there is no objective way to say marc.info is the authoritative source over the ad-ridden ones.

With Wikipedia or Stack Overflow, I think whoever gets the majority of participants going forward and keeps activity high could start claiming authority in an objective enough sense, and engaged participants are more mindful of an organization's ethics than random searchers are.

This is the important point. A serious fork of Wikipedia with a large chunk of the community behind it would be dealt with manually by Google; their explicit decision-making would be the relevant factor, rather than the algorithm.
Would de-indexing the clone pages from the start help, so only the improved pages are indexed?
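
For what it's worth, here is a rough sketch of how a fork could do that, assuming it keeps the upstream text of each page around for comparison. The function names are made up for illustration. The only real-world piece is the `X-Robots-Tag: noindex` response header, which is a documented way to ask search engines not to index a page. Whether this actually avoids a domain-wide penalty is an open question; nothing in this thread confirms it.

```python
import hashlib


def content_fingerprint(text):
    """Hash of the text after collapsing whitespace and lowercasing,
    so trivial formatting differences do not count as changes."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def robots_headers(page_text, upstream_text):
    """Hypothetical helper: extra HTTP response headers for a page on the fork.

    Pages that are still identical to upstream ask not to be indexed;
    pages that have diverged get no extra header and stay indexable.
    """
    if content_fingerprint(page_text) == content_fingerprint(upstream_text):
        return {"X-Robots-Tag": "noindex"}
    return {}
```

An exact-match fingerprint is the crudest possible test: a page with one corrected typo would count as "improved". A real fork would probably want some similarity threshold instead, but that is a judgment call this sketch leaves out.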