| This is the Founder of WP Engine. The main points in the article are factually incorrect, but some of it is in fact excellent feedback that we're going to act on. First, on staging areas: it is incorrect that they are counted as duplicate content, because we force a "deny robots everything" robots.txt file on staging. Using the article's example of our TorqueMag website: http://torque.staging.wpengine.com/robots.txt. It's true that some bots will ignore robots.txt, but all the major search engines that matter for SEO, which is the point of the article, do honor it. Some -- including Google! -- will scan the site anyway, but it doesn't count as duplicate content. Matt Cutts has been extremely clear on this point, publicly. Second, on duplicate content on the WP Engine domains (e.g. torque.wpengine.com): again, what was stated is factually incorrect for Google but is a good point for some other search engines. Here's why: Google maintains a set of root domains that it knows belong to companies that do exactly what we and many other hosting companies do. Included in that list are WordPress.com, Squarespace, and us. When Google detects "duplicate content" on subdomains from that list, it knows that's not actually duplicate content. You can see it in Google Search, but it's not counted against you. We have had a dialog directly with Matt Cutts on this point, so this is not conjecture, but fact. However, the article's suggestion that it's better to 301 that domain is still valid. Also, not all search engines are aware of this scenario, so one of our takeaways from this article is that we should auto-force robots.txt for the XYZ.wpengine.com domains just as we do for the staging domains, so that other search engines won't be confused. So in the end, we came away with a good idea of how to improve. It's a shame the point had to be made in the manner that it was, and intermixed with FUD. |
Thanks again for keeping everything in context here.
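
For anyone who wants to see what a "deny robots everything" robots.txt does for a crawler that honors it, here is a rough sketch using Python's standard `urllib.robotparser`. The file contents in `DENY_ALL` are an assumption (the usual two-line deny-all form), not a copy of what WP Engine actually serves, and the page path is made up for illustration. Nothing here touches the network.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a "deny robots everything" robots.txt.
# The quoted comment does not show the real file, so this is the
# conventional deny-all form, not WP Engine's actual output.
DENY_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-honoring crawler may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no HTTP request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page on the staging host named in the comment.
staging_page = "http://torque.staging.wpengine.com/some-post/"

print(is_allowed(DENY_ALL, "Googlebot", staging_page))
print(is_allowed("", "Googlebot", staging_page))
```

The first call checks the deny-all rules and the second checks an empty robots.txt for comparison. This only models crawlers that respect robots.txt; as the comment notes, some bots ignore it, and it says nothing about how a search engine scores duplicate content.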