Hacker News
by dalf 2771 days ago
What about RSS? Many news sites have an RSS feed:
* http://www.spiegel.de/schlagzeilen/tops/index.rss
* https://www.lemonde.fr/rss/une.xml

So as an individual I can use them, but would the now-defunct Google Reader have run into the same issue as Google News?
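For anyone wondering what a feed reader actually gets from such a feed, here is a minimal sketch using only the Python standard library. The feed content and the `read_items` helper are made up for illustration (the `news.example` URLs are placeholders); real feeds vary in which fields they fill in and how much text they put in each one.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://news.example/</link>
    <item>
      <title>First headline</title>
      <link>https://news.example/articles/1</link>
      <description>Only the opening sentence of the article.</description>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://news.example/articles/2</link>
      <description>Another short teaser.</description>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return the title, link and description of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "description": item.findtext("description", default=""),
            }
        )
    return items


if __name__ == "__main__":
    for entry in read_items(SAMPLE_FEED):
        print(entry["title"], "|", entry["link"])
        print("  ", entry["description"])
```

A reader only ever sees what is inside each `<item>`, so whatever the publisher puts in `<description>` is all that is available without following the link.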

2 comments

The publisher can control how much content is exposed via RSS (typically just the lede), whereas when third-party news aggregators present scraped content, the user never needs to visit the origin site.
The publisher can also control how much is shared with third-party aggregators, either through robots.txt or a paywall.
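As a rough illustration of the robots.txt side, here is a small sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleAggregatorBot` user agent and the `news.example` URLs are all hypothetical, and honoring robots.txt is voluntary on the crawler's part, so this only shows what a well-behaved aggregator would conclude.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named aggregator is blocked everywhere,
# every other crawler is only kept out of /premium/.
ROBOTS_TXT = """\
User-agent: ExampleAggregatorBot
Disallow: /

User-agent: *
Disallow: /premium/
"""


def can_fetch(user_agent, url):
    """Check whether the rules above allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example/articles/1"
    premium = "https://news.example/premium/story"
    print(can_fetch("ExampleAggregatorBot", article))
    print(can_fetch("SomeOtherBot", article))
    print(can_fetch("SomeOtherBot", premium))
```

Note that the rules match on URL paths per user agent, so the granularity is whole pages or sections of the site.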

Which has been the case since search engines became a thing.

That isn't the same at all. A publisher cannot use robots.txt, much less a paywall, to indicate which part of a text may be shared in syndication.
A paywall can. The page displays the snippet the publication is allowing to be shared, while the paywall hides the rest. I believe this is what a few of the bigger US newspapers are doing right now.
OK, but that would require regular readers to have credentials for the paywall. I understood the discussion to be about scraping publicly accessible sites.
I think the issue depends on whether a service is acting as a principal or an agent. If a user signs up for your service and says "I would like to subscribe to Mox News", and you pull data on behalf of that user, then I see no issue.

But in the same way that you as an individual couldn't republish those copyrighted works, an aggregation service where you choose the sources and publish your own links and summaries wouldn't be okay.