| HN Mirror



	by lucideer 959 days ago
	There's no real cat & mouse game here (yet) - sites don't do anything to mitigate this. Sites deliberately make their content available to robots to gain SEO traction: they're left with the choice of allowing this kind of bypass or hurting their own SEO. I say "yet" because there could conceivably be ways to mitigate this, but afaik most would involve individual deals/contracts between every search engine & every subscription website - Google's monopoly simplifies this somewhat, but there's not much of an incentive from Google's perspective to facilitate this at any scale.


tiagod 959 days ago

Google publishes IP ranges for GoogleBot. You can also reverse-lookup the request IP address - the resolved domain should in turn resolve to the original address.
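
A rough sketch of both checks in Python, in case it helps. The hostname suffixes and the function names here are my own assumptions, not anything copied from Google's docs, and the CIDR list is something you would have to load from the published ranges yourself. The resolver functions are passed in as parameters so the logic can be tried without touching real DNS.

```python
import ipaddress
import socket

# Assumption: hostname suffixes treated as belonging to Google's crawlers.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of address strings that hostname resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Reverse-resolve ip, check the domain, then forward-resolve it back."""
    try:
        wanted = ipaddress.ip_address(ip)
    except ValueError:
        return False
    hostname = reverse(ip)
    if hostname is None:
        return False
    hostname = hostname.rstrip(".").lower()
    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not hostname.endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False
    # The hostname must resolve back to the address that made the request.
    for addr in forward(hostname):
        try:
            if ipaddress.ip_address(addr) == wanted:
                return True
        except ValueError:
            continue
    return False


def in_published_ranges(ip, cidr_ranges):
    """Check ip against a list of CIDR strings (e.g. loaded from a published list)."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)
```

The User-Agent header is not consulted at all, since anyone can set it; only the request IP is trusted. In practice you would probably cache the result per IP rather than doing two DNS lookups on every request.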


ForkMeOnTinder 959 days ago

Does anyone else remember 10 years ago when Google would penalize sites for serving different content to GoogleBot than to normal users? Those were the days.


omoikane 959 days ago

> Google would penalize sites for serving different content to GoogleBot than to normal users

Listed under spam policies:

https://developers.google.com/search/docs/essentials/spam-po...

   "Cloaking refers to the practice of presenting different content to users and search engines with the intent to manipulate search rankings and mislead users"

The top of the page says that sites violating the policies may "rank lower or not appear in results at all".


muttled 958 days ago

It's infuriating when you see part of your desired information in the search results and then open the page to find a paywall. IIRC ExpertsExchange were doing that for a long enough time that it was obvious the policy was not enforced. At least not evenly.
