Hacker News
by pud 4490 days ago
Wikipedia's database is public and used by Google with permission. You can probably use it for your projects, too.

So this is neither scraping, nor against the rules.

Here are dumps in SQL and XML format:

http://dumps.wikimedia.org/enwiki/
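
For anyone who wants to try the XML dumps, here is a rough sketch of streaming page titles out of a MediaWiki export with Python's standard library. The `iter_page_titles` helper and the tiny `SAMPLE` document (including its namespace version) are made up for illustration; real dumps are far larger and have more fields per page, so treat this as a starting point rather than a finished reader.

```python
import io
import xml.etree.ElementTree as ET


def iter_page_titles(stream):
    """Yield page titles from a MediaWiki XML export, one at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Export files declare an XML namespace; drop it so tags compare plainly.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "title":
            yield elem.text
        elif tag == "page":
            # Release the finished page so memory use stays flat on big files.
            elem.clear()


# Hypothetical miniature export, standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Alpha</title><revision><text>first</text></revision></page>
  <page><title>Beta</title><revision><text>second</text></revision></page>
</mediawiki>"""

titles = list(iter_page_titles(io.BytesIO(SAMPLE)))
print(titles)
```

The dumps are normally distributed compressed, so in practice you would probably pass a file object from something like `bz2.open(path, "rb")` instead of the in-memory sample.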

P.S. Yes, the original post was meant to be funny, and it was; I do have a sense of humor. :)

1 comment

He's talking about outranking the true original source of the content in search results. You most certainly cannot create your own site that consists only of excerpts from Wikipedia if you wish to remain in Google's search results. Copyright is irrelevant to this.

What's bad, though, is that Google isn't just lowering the rankings of pages with non-original content now (including any kind of legitimate curation site). They're marking the entire domains of new curation sites as "pure spam," de-listing them from Google entirely, and punishing anyone who has linked to them.

This sends a clear message to developers: stay far away from Google's territory of recommending third-party content to people, no matter how you do it.

Could you show an example of such a legitimate new curation site, please?
Not new, but a legitimate curation site: http://hypem.com/