| HN Mirror



	by harperlee 3740 days ago
	So what is the legality of this? Apart from the risk of having someone pull the plug on the way one takes the information out, when is something without a proper license able to be used?

2 comments

dsp1234 3740 days ago

In the US, there is no copyright protection for "facts" on their own. However, a compilation/database of facts can have copyright protection based on a three-part test[0]:

    1. the collection and assembly of pre-existing material, facts, or data;
    2. the selection, coordination, or arrangement of those materials; and
    3. the creation, by virtue of the particular selection, coordination, or arrangement, of an original work of authorship.

But specifically there is no protection for the underlying facts themselves, and there is no "sweat of the brow" doctrine. So scraping the data, and rearranging the underlying facts into your own arrangement/organization is almost always not copyright infringement. However, if that data is categorized in some non-trivial way, and you keep that organization, then that is likely to be copyright infringement.

However, if what you're scraping is not "facts" but creative works, such as blog posts, product descriptions, etc., then it is likely to be copyright infringement.

Then on top of that, even if there is copyright infringement, other defenses, such as a license to use the data or fair use, may apply.

[0] - http://www.pddoc.com/copyright/compilation.htm


toomuchtodo 3740 days ago

> So scraping the data, and rearranging the underlying facts into your own arrangement/organization is almost always not copyright infringement.

I'm not so sure. It would definitely be illegal in the US for me to cherry-pick data out of Google Maps and add it to OpenStreetMap (and OSM has policies addressing exactly this).


iolothebard 3740 days ago

Yet companies like LexisNexis get most of the data they resell this way.


toomuchtodo 3740 days ago

Are they scraping copyrighted data? Or public records? Big difference.


iheartmemcache 3739 days ago

No one in the US can hold a copyright in pure 'facts', especially if one demonstrates they invested enough energy to 'creatively reinterpret' them. Scraping hasn't quite seen a Supreme Court ruling yet (@grellas correct me, please), but I'm sure one could make a reasonable argument that the energy invested in re-collating the data is enough to pass any barrier. See Feist Publications, Inc. v. Rural Telephone Service Co. (1991) and O'Connor's opinion.


iolothebard 3739 days ago

Facts aren't copyrightable.

They scrape everything in the world they can get their hands on.


toomuchtodo 3739 days ago

Collections of facts are: https://www.unc.edu/courses/2006spring/law/357c/001/projects...


ap22213 3739 days ago

What part of the law does this fall under? Do people get arrested for this (i.e., is it criminal)? What's the worst that can happen?


toomuchtodo 3739 days ago

https://en.wikipedia.org/wiki/Copyright_infringement

http://www.copyright.gov/title17/92chap5.html#501

https://www.law.cornell.edu/uscode/text/17/chapter-5

https://www.lib.purdue.edu/uco/CopyrightBasics/penalties.htm...


pimlottc 3739 days ago

That's begging the question of whether Google's data on public streets is actually protected by copyright under U.S. law.


toomuchtodo 3739 days ago

https://www.google.com/permissions/geoguidelines.html


lazyjones 3740 days ago

IANAL but in the EU at least, even databases comprised of simple "facts" are protected.

It's a sad state of affairs when I'm not even allowed to scrape data generated using taxpayers' money, like the (required by EU laws) noise maps for cities, which I'd like to use to augment real estate offers, for example.


Symbiote 3739 days ago

"Europe" would like to partially fund that noise database with income from businesses that use it. The result is that less taxpayer money is needed.

I think it's only the UK that has copyrightable fact databases.


techdragon 3739 days ago

Except that doesn't happen, because the last thing a new business idea needs is more red tape, paperwork, and expenditure.
