by barrkel 5596 days ago
And of course, if you change the Referer header to http://www.google.com/, you get a different page with the solution at the bottom.
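For anyone who wants to try the Referer trick without a browser extension, here is a rough sketch using only Python's standard library. The URL is a hypothetical placeholder, not the actual site, and `build_request` is just an illustrative helper name.

```python
import urllib.request


def build_request(url, referer=None):
    """Build a GET request, optionally claiming to arrive from `referer`."""
    headers = {"Referer": referer} if referer else {}
    return urllib.request.Request(url, headers=headers)


# Hypothetical URL standing in for the page under discussion.
url = "http://example.com/question"

plain = build_request(url)
from_google = build_request(url, referer="http://www.google.com/")

# To actually fetch the page, uncomment the next line (needs network access):
# body = urllib.request.urlopen(from_google).read()
```

Whether the server really switches content on this header alone is something you would have to check against the site itself; it may also look at the User-Agent or cookies.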

I find myself wondering, though: why don't they get punished more for feeding Googlebot different content from what a normal viewer sees? Isn't it basically cloaking? Clicking on the link in Google search results will still serve up the solution, but following a normal, organic link won't.

http://googlewebmastercentral.blogspot.com/2008/06/how-googl...

"Cloaking: Serving different content to users than to Googlebot. This is a violation of our webmaster guidelines. If the file that Googlebot sees is not identical to the file that a typical user sees, then you're in a high-risk category. A program such as md5sum or diff can compute a hash to verify that two different files are identical."

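The check that quote describes is easy to approximate. Below is a minimal sketch, assuming you have already saved the two responses as bytes; the sample bodies are made up for illustration and `same_content` is a hypothetical helper, not anything Google publishes.

```python
import hashlib


def same_content(a: bytes, b: bytes) -> bool:
    """Compare two response bodies by MD5 digest, much as md5sum would."""
    return hashlib.md5(a).hexdigest() == hashlib.md5(b).hexdigest()


# Made-up bodies standing in for what a crawler and a visitor receive.
crawler_view = b"<html><body>Question and full solution</body></html>"
visitor_view = b"<html><body>Question only, sign up to see more</body></html>"

print(same_content(crawler_view, crawler_view))  # identical bytes
print(same_content(crawler_view, visitor_view))  # differing bytes
```

Real pages often differ in timestamps, ads, or session tokens, so a byte-for-byte hash mismatch on its own would not prove cloaking.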

Google sort of encourages this kind of cloaking. It's called First Click Free.

http://www.google.com/support/webmasters/bin/answer.py?hl=en...

Maybe Google doesn't want holes appearing in their searches when major sites go behind paywalls (newspapers?).

Of course, I had forgotten...

Google should reconsider their position on this I think.

If Google reconsidered their position on this, a vast archive of subscription-only NYTimes, Financial Times, WSJ, etc. content would suddenly become inaccessible.

I can't count the number of times I've seen an interesting article on Reddit or HN, clicked through, found myself butting up against a paywall, and then just Googled the title and been able to read the whole thing for free. You think that these companies would suddenly put their whole archives online for free if they didn't get a bunch of search traffic for it?

I wish Google would put ACM and IEEE's feet to the fire. Their crawler definitely sees full PDFs, and we don't.

Acceptable loss in my opinion.