

by cookiecaper 4593 days ago

Was that really the consensus among the legal community? I seriously doubt that anyone familiar with copyright law would assert that publishing on the web invalidates your copyright. A weak fair use argument could be made for crawling text, because only a small (often "insignificant") portion of the work is reproduced in human-readable form, but that argument certainly doesn't apply to crawling images.

Is robots.txt a legally admissible copyright release? There's probably more room to debate that one, but it's not clear-cut. What does it cover? Does it apply to all crawlers? Can you grant a general license release like that without your work effectively becoming public domain? What's the difference between a crawler and a human reader subject to standard copyright terms? What licenses are implicitly granted by a typical robots.txt? It's not as if robots.txt is a verbose document that lays all of this out, and every one of these questions is a potential legal problem.

Also, Google assumes permission by default and refrains from crawling only if you explicitly DISALLOW it in robots.txt. This is the opposite of copyright, which reserves a monopoly to the rightsholder unless the rightsholder explicitly ALLOWS a certain use. It's undeniable that Google is violating copyright law millions of times each and every day, and that this violation is fundamental to their business.
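To make the opt-out default concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only illustrates how a robots.txt-respecting crawler typically decides what to fetch; it says nothing about how Google's crawler is actually implemented. The helper `may_fetch`, the user agent `ExampleBot`, and the `example.com` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: parses the rules from a string instead of
    downloading them, so the example runs without network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A site that publishes no rules at all (assumed example).
NO_RULES = ""

# A site that opts out of crawling for one directory (assumed example).
OPT_OUT = "User-agent: *\nDisallow: /private/\n"

# With no rules, everything is treated as permitted: silence means "allowed".
print(may_fetch(NO_RULES, "ExampleBot", "https://example.com/photos/cat.jpg"))

# Only an explicit Disallow line blocks the crawler, and only for that path.
print(may_fetch(OPT_OUT, "ExampleBot", "https://example.com/private/notes.html"))
print(may_fetch(OPT_OUT, "ExampleBot", "https://example.com/photos/cat.jpg"))
```

Note that nothing in the file format expresses a license or its terms; the parser answers "may I fetch this path?" and nothing more.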

And does any of this negate the computer access laws that make a site's ToS legally binding, even on those who don't formally agree to them? Strictly interpreted, Google would still be behaving illegally even if the copyright element were taken away.

I agree that there were search engines before Google, and that they mostly were in the same problematic legal situation.

1 comment

ghaff 4593 days ago

And then there's the Internet Archive. IANAL, but there seems to be an effective assumption that opt-out makes everything OK even if there's not much, if any, legal basis for it. (I wrote this piece back in 2005 and not a whole lot has changed: http://bitmason.blogspot.com/2005/07/thoughts-on-wayback-mac...)

To be clear, this is arguably the way that the Web almost has to work--but that doesn't make it all neatly legal under current copyright law.
