by mgdlbp 1479 days ago

These days I tend to rely on archive.org (+ archive.today) and a plain link rather than web clipping tools. I just find it more useful coming back later to have access to the full fidelity of a page and its context.

It also feels appropriate to be publicly preserving parts of the web that you've found useful enough to note down. As someone at archive.org said, 'If you see something, save something.'

Edit: n.b. archive.org does honour takedown requests, though they're rare and usually somewhat predictable. cf. gwern's drastic approach of archiving all the pages he's ever visited, https://www.gwern.net/Archiving-URLs

2 comments

choletentent 1479 days ago

Interesting. But how do you keep track of archive links that are relevant to you?

mgdlbp 1479 days ago

Hmm? I meant, instead of taking a web clipping or screenshot, I keep only a link to the page in question (and submit it to the archive if it wasn't already crawled). Though I'm still not happy with how I track what part of a page is most relevant--usually it ends up being an ad hoc mix of quoting and outlining. Related: Wasn't there an HN post about highlighting being considered harmful?

I'm also still undecided on whether to record the original URL or the archive URL when the original page is unlikely to change. Archive links contain the entire original URL, so there's no risk there; on the other hand, it tends to be clear from the date of the note which snapshot to retrieve, so there's not really a need. But dates are close to being metadata and might be lost somehow...
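To illustrate the point about archive links carrying the original URL, here is a minimal Python sketch, assuming the usual Wayback Machine layout of `https://web.archive.org/web/<14-digit timestamp>/<original URL>`. The function names `parse_wayback_url` and `to_wayback_url` are made up for this example, and archive.today short links are not handled because they do not embed the original URL.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout: https://web.archive.org/web/<YYYYMMDDhhmmss>[<modifier>]/<original URL>
# The optional two-letter modifier (for example "id_") selects a rendering mode.
_WAYBACK = re.compile(
    r"^https?://web\.archive\.org/web/"
    r"(?P<ts>\d{14})(?:[a-z]{2}_)?/"
    r"(?P<orig>.+)$"
)


class Snapshot(NamedTuple):
    captured_at: datetime
    original_url: str


def parse_wayback_url(url: str) -> Optional[Snapshot]:
    """Split a Wayback Machine link into its capture time and original URL.

    Returns None when the link does not match the assumed layout.
    """
    match = _WAYBACK.match(url)
    if match is None:
        return None
    try:
        captured_at = datetime.strptime(match.group("ts"), "%Y%m%d%H%M%S")
    except ValueError:
        # Fourteen digits that do not form a valid date and time.
        return None
    return Snapshot(captured_at, match.group("orig"))


def to_wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine link for original_url at the given time."""
    return f"https://web.archive.org/web/{when:%Y%m%d%H%M%S}/{original_url}"
```

So a note that stores only the archive link can still recover both pieces, and a note that stores only the original URL plus its date can build a link with `to_wayback_url`; as far as I know, the Wayback Machine redirects a timestamp with no exact capture to the nearest snapshot it has.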

mbreese 1479 days ago

I think they meant that instead of saving links to articles directly, they will save references to archive.org links. This is a protection against the host taking down the article for whatever reason.

m-p-3 1479 days ago

I personally save them using the SingleFile browser extension.

https://github.com/gildas-lormeau/SingleFile

mgdlbp 1479 days ago

There's also ArchiveWeb.page, which records in the same WARC format that archive.org uses.

https://github.com/webrecorder/archiveweb.page

kristiandupont 1479 days ago

This is my alternative which tries to convert a page to Markdown: https://github.com/deathau/markdownload
