by TekMol 1434 days ago
I never understood how the IA can get away with copying all those websites and all their content as if copyright did not exist.

Can anybody enlighten me as to how they have not been sued into oblivion and are not sitting in prison already?

8 comments

Browsers copy and store websites as part of their normal functioning. If you didn't want your website to be copied and stored then maybe it was better not to put it up in the first place? Anyway the IA will remove everything with a very simple, automated text file placed in the root directory.
> Anyway the IA will remove everything with a very simple, automated text file placed in the root directory.

What if the site is simply gone, or now belongs to someone else who is not the owner of the archived content?

Then you can follow these steps:

https://www.joshualowcock.com/guide/how-to-delete-your-site-...

You'll need to prove that you are the owner of the archived content, or were the owner of the domain.

Browsers can copy and store, but republishing is a totally different matter.
So you say copyright does not apply to websites?
They are saying that your overly theoretical application of copyright is in absolute contrast to the technical realities of how the web works.

It's a discussion that's been had for literally decades, because most tech-fluent people realized a long time ago that a copyright regime designed for physical distribution does not lend itself well to the intangible nature of the web, where replication is trivial and in many cases necessary to enable a lot of functions in the first place.

Sadly, that discussion simply died out at some point. I think it was around 2010, when smartphones and social media started to boom, so the copyright reform that was supposed to "fix" all this never came.

Where did I say that? By putting up the website you did, however, give implied permission to use it in ways which are fundamental to how the web works. Otherwise, why did you put the website up?
There are special rules for archives that might help: https://www.copyright.gov/title17/92chap1.html#108

Not sure if that's the specific part that lets them do what they do, or if that's from some other rule, just pointing out that this kind of rule exists.

They have a few special permissions from the US federal government, which certainly doesn’t hurt when it comes to archival efforts.
If you ask to have something removed, or to exclude your site, they do. They comply with robots.txt, I believe even retrospectively. I think they try hard NOT to be sued.

They are also not making a profit from “copied” content, and so damages would be small. Particularly as they would immediately remove the problematic content.

Basically (until now), they make themselves as reasonable and as little of a target as they can.
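
For anyone wondering what the robots.txt opt-out looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how a well-behaved crawler would interpret an exclusion rule; it is not the IA's actual code. The user-agent token `ia_archiver`, the robots.txt content, and the example.com URL are all assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt placed at the site root.
# "ia_archiver" is assumed here to be the archive crawler's user-agent token.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The archive crawler is excluded from the whole site...
archiver_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", "https://example.com/page.html")
# ...while a crawler with no matching rule is still allowed by default.
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page.html")

print(archiver_allowed, other_allowed)
```

Note that robots.txt is purely advisory: it works only because the crawler chooses to honor it, which fits the point about them trying hard not to be a target.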
Copyright is not absolute; it has some exemptions like fair use, historic preservation and education. IANAL, but I understand their work falls into at least one of these categories.
To me it seems pretty reasonable. That content is available on the public internet. And they even give perfect citations: a timestamp and address, plus all the information on the page.

If people don't want their content in such a place, they can always put it behind a login wall.

I have listened to and downloaded full music albums from the Archive. I don't know why that is possible.
Or outright warez content... Yeah, I don't know why anyone would think that should pass...
Google does the same and profits from it.