by octoberfranklin 1362 days ago
So with stuff like SearX, Invidious, Nitter, etc., you can run your own instance.

Can you run your own instance of archive.ph? Obviously not their software (they don't release it), but is there a version you can run yourself?

The important parts of what archive.ph provides are:

1. Headless Chromium that works around 99.9% of the "headless fingerprinter" techniques. Yes, they do actually spin up a full browser engine; that's what sets them apart.

2. A decent curated collection of paywall-bypass tactics.

#2 is definitely available from other sources. So really what we're looking for is a one-click "headless Chromium that deals with 99.9% of the fingerprint detectors".
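
For reference, the "spin up a full browser engine" part on its own is easy to sketch; the fingerprint evasion is the hard part and is not attempted here. The following is a minimal sketch, assuming a Chromium or Chrome binary is on PATH under one of the guessed names. The function names are made up for illustration, and none of this is archive.ph's actual method. `--headless` and `--dump-dom` are standard Chromium command-line switches.

```python
import shutil
import subprocess

# Candidate binary names are an assumption; they differ between distros.
CANDIDATE_BINARIES = ("chromium", "chromium-browser", "google-chrome")


def find_browser(candidates=CANDIDATE_BINARIES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_dump_dom_command(url, binary="chromium"):
    """Build the argv for a headless run that prints the rendered DOM."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) URL, got {url!r}")
    return [binary, "--headless", "--disable-gpu", "--dump-dom", url]


def dump_dom(url, timeout=60):
    """Render url in headless Chromium and return the serialized DOM as text.

    No attempt is made to hide the headless fingerprint, so sites that
    detect headless browsers will likely still block this.
    """
    binary = find_browser()
    if binary is None:
        raise RuntimeError("no Chromium/Chrome binary found on PATH")
    result = subprocess.run(
        build_dump_dom_command(url, binary),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

Calling `dump_dom("https://example.com")` should return the page's HTML after scripts have run, which is what plain HTTP fetchers miss.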


It’s not strictly the same, but ArchiveBox is very well maintained and well regarded for personal archiving.

https://github.com/ArchiveBox/ArchiveBox

Just a few hours ago I tried ArchiveBox again. I used the docker-compose version, imported around 600 bookmarks and... 5 hours later everything is still pending, it has stopped creating new files and folders in the data folder, and I have no idea how to find out what's wrong.

I'm not terribly motivated right now to dive into the community and ask for help, so I won't seriously complain, but it's at least not trivial to use.

Clueless here. What sort of “personal archiving” are you talking about? Like Evernote? I’ve only ever used archive.ph to facilitate sharing articles without worrying about links changing (looking at you, BBC).

For example, if you want to save a web page you're browsing, or a whole website, in a way that is a bit more reliable than hitting ctrl+s, and then want to search through what you saved.

Web archival is a very, very complex and deep topic. If you're interested in it, the Archive Team wiki is a good resource. Start here, I think:

https://wiki.archiveteam.org/index.php/Introduction

https://wiki.archiveteam.org/index.php/Software

https://wiki.archiveteam.org/index.php/The_WARC_Ecosystem
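
To give a feel for what a WARC file actually contains, here is a minimal sketch that builds a single `resource` record by hand using only the standard library. The function name and defaults are made up for illustration. Real archiving tools typically write full request/response records and compress each one, so treat this as a look at the record layout, not a replacement for them.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_warc_record(target_uri, payload, content_type="text/html",
                      date=None, record_id=None):
    """Return one WARC/1.0 'resource' record as bytes.

    payload must be bytes; date, if given, must be a timezone-aware datetime.
    """
    if date is None:
        date = datetime.now(timezone.utc)
    if record_id is None:
        record_id = f"<urn:uuid:{uuid.uuid4()}>"
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", record_id),
        # WARC dates are UTC timestamps in ISO 8601 form.
        ("WARC-Date", date.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = b"WARC/1.0" + CRLF
    for name, value in headers:
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # Layout: header lines, blank line, content block, then two CRLFs.
    return head + CRLF + payload + CRLF + CRLF


record = build_warc_record(
    "https://example.com/",
    b"<html></html>",
    date=datetime(2020, 1, 1, tzinfo=timezone.utc),
    record_id="<urn:uuid:00000000-0000-0000-0000-000000000000>",
)
```

Appending records like this to a file, one after another, gives a plain uncompressed `.warc` file.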