Hacker News
by danielbarla 4489 days ago
That's one way of looking at it. On the other hand, they link to the original URL, passing traffic back to the original source. Most "scraper" sites take the content, wrap it in their own similar outer layer, and try to take ad revenue. E.g. I've seen my own StackOverflow answers copied, word for word, to a scraper site and presented under a made-up name.
3 comments

StackOverflow actually allows this; all their data is Creative Commons licensed, and they publish the full database dump on the Internet Archive.

https://archive.org/details/stackexchange

Do the terms of the license allow for this kind of abuse?

Just because something is CC doesn't mean you can do whatever you want with it.

Yes, they do; it's not abuse when you're given explicit permission. CC BY-SA means you can do whatever you want with it as long as you attribute the source as specified.
"as long as you attribute the source"

danielbarla said that they presented the material under a false name; this goes beyond copying and becomes plagiarism, which I can't imagine is an intended result of the CC license.

Is the source 'User X' or 'StackOverflow'? When you reference CC BY-SA code you don't reference the people who, say, checked it into git but rather the whole repo.
CC BY-SA is short for Creative Commons Attribution Share-Alike. BY means you must attribute, and SA means you must license any distributed derivative works under the same license (copyleft). Attribution on its own is not enough.
No, attribution is required.
Interesting, from the file sizes you can quickly gauge the relative popularity of each subject.
By having a tl;dr about the actual Wikipedia page, there is no need for the user to click on the link. Following what you're saying, Google has wrapped it in their own layer and is trying to take ad revenue.
Actually, I find that having a tl;dr will rarely answer the question(s) I have on a topic, but it will commonly show me whether I've found the right Wikipedia page. I usually either click through or refine my search.
They don't actually link to the Wikipedia URL. They mask a link that leads to another Google page "/url?sa=t&rct=j&q=&...." which in turn responds with a 200 OK page that redirects to Wikipedia.

Sure, it passes the keywords etc., but this likely reduces the number of people visiting Wikipedia while increasing Google's ad revenue. If anyone but Google did this, they'd potentially be blacklisted by Google.

Actually, they do link to the Wikipedia URL.

href="http://en.wikipedia.org/wiki/Scraper_site" appears directly in the source code of that web page.

It also has an onmousedown handler that rewrites the URL to point at Google, so they can tell which link you clicked, to improve their ranking system. And Google works very closely with sites to make sure the sites know how to understand the referrals.
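
For anyone curious what that kind of rewrite looks like, here is a minimal sketch in TypeScript. It is not Google's actual code: the redirect endpoint `https://search.example/url`, the `q` parameter, and the names `LinkLike`, `buildTrackingUrl`, and `attachClickTracking` are all made up for illustration. The idea is only that the real destination sits in `href` until the mouse button goes down, at which point it is swapped for a redirect URL that can log the click.

```typescript
// Hypothetical redirect endpoint, chosen for illustration only.
const TRACKING_BASE = "https://search.example/url";

// Minimal stand-in for an anchor element so the sketch runs outside a browser.
interface LinkLike {
  href: string;
  onmousedown: (() => void) | null;
}

// Build a redirect URL that carries the real destination as a query parameter.
function buildTrackingUrl(target: string): string {
  return `${TRACKING_BASE}?q=${encodeURIComponent(target)}`;
}

// Keep the real destination in href, and swap it only on mouse down.
function attachClickTracking(link: LinkLike): void {
  const original = link.href;
  link.onmousedown = () => {
    link.href = buildTrackingUrl(original);
  };
}

const link: LinkLike = {
  href: "http://en.wikipedia.org/wiki/Scraper_site",
  onmousedown: null,
};
attachClickTracking(link);

// What "view source" or hovering would show.
const hrefBeforeClick = link.href;

// Simulate the user pressing the mouse button on the link.
if (link.onmousedown) link.onmousedown();

// What the browser would actually navigate to.
const hrefAfterClick = link.href;
```

This would explain why both descriptions above can be true at once: the page source shows the Wikipedia URL, while the request made on click goes through the redirect first.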