| HN Mirror

by ds9 4536 days ago

Google does not reproduce whole articles, only short excerpts to help searchers decide whether a result is relevant to what they're looking for - and with clear indication of the source and in a context where it's understood that Google is showing the blurb only to point to the source where it was found.

This is technically scraping but it's hardly comparable to the bottom-feeders that plagiarize for money. (Edit: according to 'pud' on this page, Google uses a Wikipedia index so it's not scraping, but it is in the case of other sites that Google indexes.)

And yes, it's OK both legally and ethically if you do the same to Wikipedia, provided you do it like Google, that is, just for indexing purposes and not using whole articles.

3 comments

Silhouette 4536 days ago

Google does not reproduce whole articles, only short excerpts to help searchers decide whether it's relevant to what they're looking for

And what about, for example, Google's image search tool, where the image itself might be what their user is searching for, and where Google controversially changed their system a little while ago to show full-size images in-SERP and de-emphasize forwarding search users to the original source? Or Google Cache, if it's reproducing material that has since been taken down deliberately from the original source?

To add insult to injury, some Google services still appear to rely on the original source's bandwidth to serve things like images (not to mention avoiding a certain legal argument about copyright infringement). That violates a basic principle of netiquette, one that has been good manners ever since people actually used the word netiquette: you don't hotlink other people's stuff on your site.

josefresco 4536 days ago

You're comparing what Google does to another extreme when you say things like "bottom-feeders that plagiarize for money".

Surely you don't believe that all "scrapers" are bottom-feeders? It's like saying every criminal is a murderer. There's a whole bunch of grey area in between, and this is where the criticism of Google's harsh penalties is valid.

AJ007 4536 days ago

You are at least partially incorrect.

Last year Google was testing reproducing entire Wikipedia articles on their mobile site. You could read the full article without going to Wikipedia (allowed by Creative Commons, of course). Between that and what they did with Google Images, I would say this reveals intention and is the direction web publishers should expect Google to be headed in.

In order for Google to continue to meet their growth targets, they must shift a growing percentage of outgoing clicks from free to paid.
