Ask HN: How to organize archived webpages locally?
7 points
by linuxfan2718
1115 days ago
I've been going through hundreds of bookmarks I made over the years, all carefully tagged and organized, but a lot of the pages have been taken down. I want to start archiving them locally, probably using Firefox's "Save Page As..." feature. Do people here do this? If so, how do you organize and tag the files? Folders aren't perfect because some pages deserve multiple tags.
Since your bookmarks are already tagged, perhaps you don't need to tag the files as well? Tagging the files might be convenient in some ways, but it comes at the cost of duplicating the information. As long as you can map a bookmarked URL to a file path or paths, you can find archived copies through your bookmarks.
Here is what I do for external URLs on my personal website. It is inspired by Gwern's approach. A major difference is that he doesn't nest directories; he uses ${domain}/${url-checksum}.ext.
I translate the URL to a file path in my link-archive directory by applying the function dest-dir from the Tcl code below. In the directory, I save whatever is at the URL with a name based on its checksum (b2sum -l 32), so I can have multiple archived copies of the same URL. I use https://github.com/gildas-lormeau/single-file-cli to save the URL. I determine the destination file extension from the MIME type.
This gives you paths like link-archive/365tomorrows.com/2005/10/23/postcard/e5445dff.html for https://365tomorrows.com/2005/10/23/postcard/.
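For readers who don't use Tcl, the mapping can be sketched roughly in Python. This is an illustration of the scheme as described above, not a port of my dest-dir function: the names `dest_dir`, `short_checksum`, and `archive_path` are made up for the example, it assumes the checksum is taken over the saved content, it ignores query strings and fragments, and it takes the file extension as a parameter instead of deriving it from the MIME type.

```python
import hashlib
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def dest_dir(url: str, root: str = "link-archive") -> PurePosixPath:
    """Map a URL to an archive directory: root/host/path-segments.

    Assumption: the query string and fragment are dropped, and empty,
    "." and ".." segments are skipped so the result stays under root.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s and s not in (".", "..")]
    return PurePosixPath(root, host, *segments)


def short_checksum(data: bytes) -> str:
    """Return 8 hex digits from a 32-bit BLAKE2b digest.

    This is meant to correspond to `b2sum -l 32` run on the same bytes.
    """
    return hashlib.blake2b(data, digest_size=4).hexdigest()


def archive_path(url: str, data: bytes, ext: str = "html") -> PurePosixPath:
    """Full path for one archived copy of `url` whose content is `data`."""
    return dest_dir(url) / f"{short_checksum(data)}.{ext}"


if __name__ == "__main__":
    page = b"<html>placeholder content</html>"  # stand-in for the saved page
    print(archive_path("https://365tomorrows.com/2005/10/23/postcard/", page))
```

Because the file name depends on the content and not on the URL, saving the same URL again after the page changes produces a second file in the same directory, without overwriting the first.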