by grn 2792 days ago
Correct me if I'm wrong, but I think publishers are trying to have their cake and eat it too. They want their content to be discoverable in search engines. They also want readers to pay for it. Given how search engines currently work, this is a self-contradictory model.

I imagine search engines could offer publishers an API to submit content for indexing without having to publish it. Google and others could even charge publishers for that, and publishers could then ask readers for subscriptions.
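To make the idea concrete, here is a rough sketch of what a publisher might push to such an API. No such API is described in this thread, so everything here is an assumption for illustration: the function name `build_index_submission`, the field names, and the `access` value are all hypothetical.

```python
import hashlib
import json


def build_index_submission(url, title, body, price_cents=None, currency="USD"):
    """Build the JSON a publisher might push to a (hypothetical) indexing API.

    The full text travels to the search engine for indexing, while the public
    page at `url` can stay behind the paywall.
    """
    payload = {
        "url": url,
        "title": title,
        "body": body,
        # A digest lets the receiver detect a corrupted or truncated upload.
        "content_sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
        # Hypothetical flag telling the engine the page itself is paywalled.
        "access": "subscription",
    }
    if price_cents is not None:
        payload["price"] = {"amount_cents": price_cents, "currency": currency}
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(build_index_submission(
        "https://example.com/news/story",
        "Example headline",
        "Full text that readers only see after subscribing.",
        price_cents=500,
    ))
```

The point of the sketch is only that indexing and publishing are separated: the search engine receives the text directly instead of crawling a public page.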

3 comments

I absolutely support such programs. Similarly to adblockers and DRM, they show that the mechanisms used to generate money simply do not work. Technically, it is impossible to expose content to a (web crawler) robot but not to a human (inverse CAPTCHA). And technically, it is impossible to control pixels on a free user device.

Publishers, if you want people to pay for your content, offer honest paid subscriptions and accept that you will vanish from the openly accessible web.

Not necessarily. I could imagine Google adding a "paid crawling" service where you tell Google how much your content costs, explicitly allow them access via some authenticated method, and then they display the price next to your content in search results.

You're imagining that it can't work because it can't work in a generic way with all search engines, including future ones that don't exist yet. But that doesn't have to be the case. Most content providers only care about Google.
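One way the "authenticated method" could look is a shared secret between the publisher and a single crawler, with each crawl request signed using an HMAC. This is only a sketch under that assumption; the function names, the message layout, and the five-minute window are made up for illustration and do not describe anything Google offers.

```python
import hashlib
import hmac


def sign_crawl_request(secret: bytes, path: str, timestamp: int) -> str:
    """Crawler side: sign the requested path together with a timestamp."""
    message = f"{timestamp}:{path}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_authorized_crawler(secret: bytes, path: str, timestamp: int,
                          signature: str, now: int,
                          max_age_seconds: int = 300) -> bool:
    """Publisher side: serve the full text only if the signature checks out."""
    # Reject stale timestamps so a captured signature cannot be replayed later.
    if abs(now - timestamp) > max_age_seconds:
        return False
    expected = sign_crawl_request(secret, path, timestamp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"shared-secret-agreed-out-of-band"
    sig = sign_crawl_request(secret, "/news/story", 1_000_000)
    print(is_authorized_crawler(secret, "/news/story", 1_000_000, sig, now=1_000_060))
```

Unlike a User-agent check, a reader cannot pass this test just by changing a header, because they do not hold the secret. The trade-off is the one the comment already names: it has to be set up per search engine.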

What I find most interesting is the very clear but often ignored 'Cloaking Guidelines' by Google:

"Cloaking refers to the practice of presenting different content or URLs to human users and search engines. Cloaking is considered a violation of Google’s Webmaster Guidelines because it provides our users with different results than they expected.

Some examples of cloaking include:

...

- Inserting text or keywords into a page only when the User-agent requesting the page is a search engine, not a human visitor

" [0]

Google is happily showing content from LinkedIn, FB, Pinterest and news sites. But when I, Joe User, go to the page, I see nothing but some login/register/pay now form. How is this not a violation of the cloaking guidelines? Clearly Google is getting different content than I am!

(Presumably this is how the article's extension works... by masquerading as Googlebot -- again proving that these sites are serving up different content)

[0] https://support.google.com/webmasters/answer/66355?hl=en

edit: formatting
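The User-agent cloaking described in the comment above, and the presumed trick of masquerading as Googlebot, can be illustrated with a toy example. This is a guess at the general pattern, not how any particular site or the article's extension is known to work; `render_page` and the page strings are hypothetical.

```python
FULL_ARTICLE = "Full article text."
PAYWALL = "Subscribe to continue reading."

# User-agent string that Googlebot sends on desktop crawls.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# An arbitrary browser User-agent, used here as the "human visitor".
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0"


def render_page(user_agent: str) -> str:
    """Toy server logic: trust whatever the client claims in User-agent."""
    if "googlebot" in user_agent.lower():
        return FULL_ARTICLE
    return PAYWALL


if __name__ == "__main__":
    print(render_page(BROWSER_UA))    # Subscribe to continue reading.
    # A reader who sends the crawler's header gets the crawler's page.
    print(render_page(GOOGLEBOT_UA))  # Full article text.
```

Because the header is entirely client-controlled, a check like this serves different content to the crawler and to visitors, and anyone can switch sides by changing one string.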

It’s hardly having their cake and eating it too. It’s the equivalent of wanting placement at a news vendor. Your API idea isn’t a bad one, but I hardly think that these news organizations are at fault for wanting to be discoverable while also wanting money in exchange for their hard work.