Hacker News
by TomAnthony 4119 days ago
Google disagree with your assessment:

https://productforums.google.com/forum/#!searchin/en/cookie$... (see the response marked as best answer by Matt Cutts, head of web spam at Google).

If I had never been to the site, I'd land on the unfiltered page, which would be a good result. If I had a cookie (which seems to be a session cookie, from a quick look), then it is likely I was recently at the site, so the filters are likely relevant; if not, they are easy to change.
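
A minimal sketch of that behaviour, assuming a hypothetical `session_id` cookie and an in-memory `session_store` (neither name comes from the site being discussed): a request without a known cookie gets the unfiltered page, and a returning visitor gets their last filters.

```python
from http.cookies import SimpleCookie

# Assumption: an empty dict stands for the unfiltered (default) page.
DEFAULT_FILTERS = {}


def filters_for_request(cookie_header, session_store):
    """Pick the filters to apply, given the raw Cookie header (or None)."""
    if not cookie_header:
        # First-time visitors and cookie-less crawlers see the unfiltered page.
        return dict(DEFAULT_FILTERS)

    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("session_id")  # hypothetical cookie name
    if morsel is None:
        return dict(DEFAULT_FILTERS)

    # Unknown or expired sessions also fall back to the unfiltered page.
    return dict(session_store.get(morsel.value, DEFAULT_FILTERS))
```

A client that never sends the cookie back always takes the first branch, so it only ever sees the unfiltered page.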

'Cloaking' has negative connotations and is more of a concern when there is an attempt to mislead search engines. In this instance, there is a big problem with your suggested fix -- the Panda algorithm would see many very similar pages, which might actually make things worse (which I agree is silly, as your solution would otherwise have some upsides, but there is often a trade-off in these situations).

2 comments

That's a simplistic way of thinking about the problem -- as a search engine professional (not SEO), I'd never recommend something that depends on GoogleBot figuring out that I'm not really cloaking.

The duplicate content problem you describe is fixable (edit: and is already a problem, I'm only recommending changing links, not adding any pages to the site.)

And by the way, there are plenty of websites that force crawlers to use cookies in order to crawl the site. I don't know how GoogleBot deals with that, but I bet it involves crawling with cookies... no matter what the forum post says.

Yeah - I don't disagree that there is possibly some level of risk. But if your concern is "GoogleBot figuring out that I'm not really cloaking" based on the presence of cookies, then I'd challenge (what I think is) your implication that having cookies on your site means Googlebot might suspect you of cloaking.

As to Googlebot's use of cookies - there is debate and folklore, but in the tests I have run I have not seen Googlebot ever send back a cookie that I have sent it.
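
For anyone who wants to run a similar check, here is a rough sketch, not the exact test I ran. It assumes you have set a hypothetical `crawl_test` cookie on every response and collected (user agent, Cookie header) pairs from your own request logs. The user-agent string can be spoofed, so treat the counts as indicative only.

```python
from http.cookies import SimpleCookie

TEST_COOKIE = "crawl_test"  # hypothetical cookie name set on every response


def returned_test_cookie(cookie_header):
    """True if the raw Cookie header contains the test cookie."""
    if not cookie_header:
        return False
    jar = SimpleCookie()
    jar.load(cookie_header)
    return TEST_COOKIE in jar


def summarize(requests):
    """Count Googlebot requests and how many sent the test cookie back.

    `requests` is an iterable of (user_agent, cookie_header) pairs;
    cookie_header may be None when no Cookie header was sent.
    """
    seen = 0
    returned = 0
    for user_agent, cookie_header in requests:
        if "Googlebot" not in user_agent:
            continue
        seen += 1
        if returned_test_cookie(cookie_header):
            returned += 1
    return seen, returned
```

If `returned` stays at zero over a large number of Googlebot requests, that is consistent with the crawler not sending cookies back, though it does not prove it never does.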

Google do manual reviews of pages, and I am confident the site in this example (for the case in question, at least) would pass that without a problem.

I'm (genuinely) interested in your proposed solution for dealing with the duplicate content problem. The problem with the Panda algorithm is that it tends to be a bit touchy, and it seems easy to fall foul of it even in innocent situations like this one.

That's not my implication, nor what I said! I said that this website should choose a link method which is unambiguously not cloaking. Then there's no chance that you'll confuse search engine bots.

The duplicate content issue is not in play for my suggestion; as my edit above states, I'm only recommending changing links, not creating any new URLs.

Exactly +1