by mgdlbp 1479 days ago
These days I tend to rely on archive.org (+ archive.today) and a plain link rather than web clipping tools. I just find it more useful coming back later to have access to the full fidelity of a page and its context.

It also feels appropriate to be publicly preserving parts of the web that you've found useful enough to note down. As someone at archive.org said, 'If you see something, save something.'

Edit: n.b. archive.org does honour takedown requests, though they're rare and usually somewhat predictable. cf. gwern's drastic approach of archiving all the pages he's ever visited, https://www.gwern.net/Archiving-URLs

2 comments

Interesting. But how do you keep track of archive links that are relevant to you?
Hmm? I meant, instead of taking a web clipping or screenshot, I keep only a link to the page in question (and submit it to the archive if it wasn't already crawled). Though I'm still not happy with how I track what part of a page is most relevant--usually it ends up being an ad hoc mix of quoting and outlining. Related: Wasn't there an HN post about highlighting being considered harmful?

I'm also still undecided on whether to always record the original URL or the archive URL when the original page is unlikely to change. Archive links contain the entire original URL, so there's no risk there; on the other hand, it tends to be clear from the date of the note which snapshot to retrieve, so there's not really a need. But dates are close to being metadata and might be lost somehow...
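
To make the "archive links contain the entire original URL" point concrete, here is a rough Python sketch. It assumes the usual Wayback Machine snapshot layout, `https://web.archive.org/web/<timestamp>/<original URL>`, and does not cover archive.today links. The function names `split_wayback_url` and `wayback_url_for` are made up for illustration and are not part of any archive.org tooling.

```python
import re
from datetime import date

# Assumed layout of a Wayback Machine snapshot URL:
#   https://web.archive.org/web/<timestamp>/<original URL>
# where <timestamp> is up to 14 digits (YYYYMMDDhhmmss), optionally followed
# by a two-letter modifier such as "id_".
_WAYBACK = re.compile(
    r"^https?://web\.archive\.org/web/(\d{1,14})(?:[a-z]{2}_)?/(.+)$"
)


def split_wayback_url(archive_url):
    """Return (timestamp, original_url), or None if this is not a snapshot URL."""
    match = _WAYBACK.match(archive_url)
    if match is None:
        return None
    return match.group(1), match.group(2)


def wayback_url_for(original_url, note_date):
    """Build a snapshot URL from a plain link plus the date of the note."""
    return f"https://web.archive.org/web/{note_date:%Y%m%d}/{original_url}"


if __name__ == "__main__":
    snapshot = wayback_url_for("https://www.gwern.net/Archiving-URLs", date(2021, 3, 14))
    print(snapshot)
    print(split_wayback_url(snapshot))
```

Going from an archive link back to the original is lossless, while going the other way relies on the note's date surviving. As far as I know, the Wayback Machine redirects a date-only timestamp to the nearest capture, so a day-level date is usually enough.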

I think they meant that instead of saving links to articles directly, they will save references to archive.org links. This is a protection against the host taking down the article for whatever reason.
I personally save them using the SingleFile browser extension.

https://github.com/gildas-lormeau/SingleFile

There's also ArchiveWeb.page, which records in the same WARC format that archive.org uses:

https://github.com/webrecorder/archiveweb.page

This is my alternative which tries to convert a page to Markdown: https://github.com/deathau/markdownload