| HN Mirror



	by StavrosK 5745 days ago
	Then we could sue Google for copyright infringement for caching our pages, I guess... Why would we not allow Google to cache it? Each cached page has a great big box on top saying that this is the historious version of the cache and linking to the original site... Example: http://cache.historious.net/cached/515865/

2 comments

sounddust 5745 days ago

There's a huge difference between what Google is doing and what you're doing.

1) Google is caching pages for a specific purpose and ensuring that they aren't cached/scraped by others:

http://webcache.googleusercontent.com/robots.txt

By not excluding robots, you're opening yourself to all kinds of situations where you are responsible for draining revenue from the owner of the content, which leaves you liable to lawsuits. By contrast, the way that Google caches content and their rules surrounding it do not generally harm the copyright owner.

2) Google honors all robots.txt, no-archive meta-tags, and other indications that the author doesn't want the page to be cached. Is historious doing the same?
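For anyone wondering what "honoring" these signals might look like in practice, here is a rough Python sketch, not a description of how Google or historious actually does it. It checks a page against an already-fetched robots.txt and looks for a `noarchive` directive in a robots meta tag. The names `may_cache`, `RobotsMetaParser`, and the `example-cache-bot` user agent are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def may_cache(url, robots_txt, html, user_agent="example-cache-bot"):
    """Return True only if neither robots.txt nor a noarchive tag objects.

    robots_txt and html are the already-downloaded text of the site's
    robots.txt and of the page itself; user_agent is a hypothetical name.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        return False

    meta = RobotsMetaParser()
    meta.feed(html)
    return "noarchive" not in meta.directives
```

This only covers the two signals named above; a real service would presumably also need to handle things like per-bot meta tags and HTTP headers.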


StavrosK 5745 days ago

1) We do exclude robots now, yes. 2) historious doesn't spider websites; it only saves the pages the users give us. It's the same as a user deciding to make a backup of a webpage on their computer...


carbocation 5745 days ago

"It's the same as a user deciding to make a backup of a webpage on their computer..."

... and then publishing it on the Internet.

(This is not meant to be snarky or to imply opposition to your product at all. I think there is a meaningful difference between saving to a computer and saving to a web-accessible, apparently globally readable website.)


StavrosK 5745 days ago

Isn't it the user's responsibility to obey copyright restrictions in this case, given that we never publish content unless the user does it? It's basically the same situation as hosting a website: if you upload and publish a copyrighted page, is the host responsible?


sounddust 5745 days ago

In my opinion, those two cases are not similar. I doubt that this type of automatic caching/publishing would have any protection under the DMCA safe-harbor laws unless you're making it clear to users what they're doing (I'm not a user of the service, so maybe you already are).

If I understand correctly, the users of your site are simply bookmarking pages. You are then caching it, storing it, and publishing it with a world-readable URL. There are many ways that you could provide the same experience to the user without making the cached page publicly accessible.

If you were to give users the option to make specific bookmarks world-readable - and you provided a disclaimer explaining that they should not make copyrighted material world-readable - then it might be different. But that's probably something you should discuss with an attorney.


StavrosK 5745 days ago

Ah, no, our users cache pages, but if they want the cache world-readable, they need to explicitly click the "publish" link.

Thank you for the information, I'll talk to our lawyer about it just to be safe.
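As a rough illustration of the private-by-default behaviour described above (a sketch only, not historious's actual code; `CachedPage`, `publish`, and `can_view` are hypothetical names), the visibility rule could look something like this in Python:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedPage:
    owner: str               # user who saved the page
    source_url: str          # original location of the content
    published: bool = False  # private until the owner opts in

    def publish(self, user: str) -> None:
        """Make the cache world-readable; only the owner may do this."""
        if user != self.owner:
            raise PermissionError("only the owner can publish a cached page")
        self.published = True


def can_view(page: CachedPage, viewer: Optional[str]) -> bool:
    """Anonymous viewers (None) and other users see only published pages."""
    return page.published or viewer == page.owner
```

The point of the sketch is just that nothing becomes world-readable without an explicit action by the user who saved it.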


pbhjpbhj 5745 days ago

If someone uses your service to republish a few dozen News Corp pages and then sends them the link, I reckon you'll be in court before sunset.

Edit: I think it's a great idea though to save bookmarked content, just not to republish it without permission.


bananaandapple 5745 days ago

Indeed, what Google is doing (caching a page and showing it to the user) is copyright infringement in some countries (e.g., Belgium, ...).

There hasn't been any case against them, but theoretically someone could sue them. Who will win is a different story.


StavrosK 5745 days ago

Hmm, that's interesting... Another difference is that Google is doing it by itself, whereas historious only stores pages that users specify and only publishes them when the user specifies it.

We'll have a chat with our lawyer regardless, thank you!


pbhjpbhj 5745 days ago

Google honour robots.txt.
