Hacker News
by NorwegianDude 929 days ago
I think search engines should be opt-in, not opt-out. Just linking to something is fine, but search engines also take content and use it to earn money. One example is the cards on Google, where they extract content and show it alongside ads.

Nothing else I can think of works like this. Generally if you take the work of others without being allowed and use it to make money you'll end up in trouble.

Now, for most people the trade-off is worth it, but it should not be the default. It should be a quick opt-in using robots.txt.
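To make the opt-in idea concrete, here is a rough sketch of how a crawler could behave under that default. This is not how any real search engine works today, and `opted_in` is a hypothetical helper: it treats a site as indexable only when robots.txt contains an explicit `Allow` rule in a group that names the crawler (or `*`). It also ignores paths and wildcards, which a real implementation would need to handle.

```python
def opted_in(robots_txt: str, user_agent: str) -> bool:
    """Return True only if a group matching user_agent has an explicit Allow rule.

    Hypothetical opt-in policy: a missing or empty robots.txt means "do not index".
    """
    agents: list[str] = []  # user agents named by the group currently being read
    in_rules = False        # True once the current group has started listing rules
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            matches = "*" in agents or user_agent.lower() in agents
            if field == "allow" and value and matches:
                return True
    return False


# No robots.txt content at all: under opt-in, the crawler stays away.
print(opted_in("", "ExampleBot"))
# An explicit Allow for everyone counts as opting in.
print(opted_in("User-agent: *\nAllow: /\n", "ExampleBot"))
```

The only change from the current convention is the final `return False`: silence means no instead of yes.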

3 comments

> I think search engines should be opt-in, not opt-out

If you put your site on the public internet, you are effectively opting in. In addition, robots.txt is trivially easy to configure if you really don't want your site to be crawled by specific parties' crawlers.
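For reference, a minimal sketch of what that opt-out looks like, checked with Python's standard `urllib.robotparser`. The crawler name `ExampleBot` and the example.com URL are made up for illustration, and this only shows what a well-behaved crawler would conclude; robots.txt is advisory, not enforced.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Ask the standard-library parser whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named crawler is shut out; any other crawler is still welcome.
print(can_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/article"))
print(can_crawl(ROBOTS_TXT, "OtherBot", "https://example.com/article"))
```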

The news corporations absolutely want their sites crawled, to the point that if Google unilaterally stopped crawling their sites they would run to the courts to file lawsuits.

According to this law, "just linking to something" is not fine.
This is such a bizarre way of thinking that's utterly antithetical to how the internet works. If you publish something publicly on the internet, people can see it and link to it. If you don't want that, gate access behind a paywall or login, or don't publish it at all. Publishing it publicly and then demanding conditions on how it is accessed is kind of nonsensical.
Generally the problem is not the link itself, it's the headline and any included blurb/summary/image from the article. The problem is that users get the content from the link aggregator alone, without ever actually visiting the originating website and participating in its monetization methods. A bare "http://..." URL probably would not have raised the same objections, but also no one would actually use a link aggregator that is just bare URLs.

Both sides have a point here. I'm not sure where I end up on the issue myself.

> it's the headline and any included blurb/summary/image from the article.

Why would that be a problem? Open Graph was created by Meta exactly so that these news sites have editorial control over those details. And the Canadian news outlets have adopted Open Graph, so that is not an issue. If they don't like what is shown, they have full control to change it. The headline/blurb/image you see is what they have specified. That is what they want you to see.
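As a rough illustration of that control, here is a sketch of how an aggregator might build a link preview from a page's Open Graph tags, using only Python's standard `html.parser`. The sample page, its URL, and the `extract_open_graph` helper are invented for the example; real aggregators presumably do more (fallbacks, caching, image fetching).

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html: str) -> dict[str, str]:
    """Return the og:* metadata the publisher declared in the page."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical article page: the publisher picks the headline, blurb, and image.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Example headline chosen by the publisher" />
<meta property="og:description" content="Example blurb chosen by the publisher" />
<meta property="og:image" content="https://example.com/lead-image.jpg" />
</head><body>Full article text lives here.</body></html>
"""

preview = extract_open_graph(SAMPLE_PAGE)
print(preview["og:title"])
```

Nothing from the article body ends up in `preview`; the aggregator only sees what the publisher put in the `og:` tags.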

The real problem is that reporting the news doesn't pay very well anymore and certain news players were on the hunt for a government subsidy to offset that reality.

> If they don't like what is shown, they have full control to change it.

Sure, and then some other news agency that does include more content in their metadata will be shared in the aggregator instead, because it will at least bring in a little revenue. It's a race to the bottom, and the outcome for the original reporter is the same: no payment for their work.

> The real problem is that reporting the news doesn't pay very well anymore

Well, yeah! That is the problem! Google is getting the money that used to go to the news agencies. This is one attempt at a fix.

> and the outcome for the original reporter is the same: no payment for their work.

What provisions are in this bill to ensure that the money ends up in the reporters' hands, and not the media agencies' hands? Without that, you still have the exact same "race to the bottom" scenario, with the reporter seeing no payment for their work.

What provisions are in this bill to ensure that the reporters are able to create their own brand to break free of the media agencies? If the intent is to see Google/Meta drive the traffic and also pay the reporters, the middleman agency serves no purpose.