| What I really want is a script that (1) Takes a URL and optional comment as input (2) Saves the webpage it points to into a git repo (a simple curl should suffice for most websites) (3) Inserts that URL, the title of the page the URL points to, and the optional comment into an org-mode file that lives in the root of the repo. The org-mode file is a highly-searchable and context-preserving database (I can add tags, create hierarchies, add links to and from other relevant (org-mode or not) files) in the most portable format ever: plain text. I really don't need a web interface. Actually, if I later decide that I need one, I can build one easily on top of this basic system. I really want to be able to use this across multiple devices: mainly my two computers, and an Android phone. Using git gives me a reliable protocol for syncing between multiple devices. I want it to be a smooth experience on my phone, which would probably require some sort of git-aware app. Something similar to the Android client for the pass password manager would be ideal. I hear that git repos can be GPG-encrypted. Ideally, I'm able to serve all this off of a repo hosted on a VPS. I don't want to rely on Dropbox (I'm trying to transition away from it) for syncing. |
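For what it's worth, the three steps in the quoted wish list could be sketched roughly like this in Python. This is only a minimal sketch under some assumptions that are mine, not the poster's: the repo already exists and is initialised, the org file is called `bookmarks.org`, saved pages go under `pages/` named by a hash of the URL, and a plain HTTP fetch stands in for the "simple curl".

```python
import hashlib
import html
import re
import subprocess
import urllib.request
from pathlib import Path


def extract_title(page: str, fallback: str) -> str:
    """Return the text of the first <title> element, or the fallback."""
    match = re.search(r"<title[^>]*>(.*?)</title>", page, re.IGNORECASE | re.DOTALL)
    if not match:
        return fallback
    # Decode entities such as &amp; and collapse runs of whitespace.
    title = " ".join(html.unescape(match.group(1)).split())
    return title or fallback


def org_entry(url: str, title: str, comment: str = "") -> str:
    """Format one bookmark as an org-mode heading containing a link."""
    # Square brackets would break the [[url][title]] link syntax.
    safe_title = title.replace("[", "(").replace("]", ")")
    entry = f"* [[{url}][{safe_title}]]\n"
    if comment:
        entry += f"  {comment}\n"
    return entry


def archive(url: str, comment: str, repo: Path) -> Path:
    """Save the page into the repo, log it in bookmarks.org, and commit."""
    with urllib.request.urlopen(url, timeout=30) as response:
        raw = response.read()
    page = raw.decode("utf-8", errors="replace")
    # Assumed naming scheme: a hash of the URL, so re-saving overwrites.
    name = hashlib.sha256(url.encode()).hexdigest()[:16] + ".html"
    saved = repo / "pages" / name
    saved.parent.mkdir(parents=True, exist_ok=True)
    saved.write_bytes(raw)
    # Assumed file name for the org-mode database in the repo root.
    with open(repo / "bookmarks.org", "a", encoding="utf-8") as org:
        org.write(org_entry(url, extract_title(page, url), comment))
    subprocess.run(["git", "add", "-A"], cwd=repo, check=True)
    subprocess.run(["git", "commit", "-m", f"Archive {url}"], cwd=repo, check=True)
    return saved
```

Calling `archive("https://example.com", "read later", Path("~/bookmarks").expanduser())` would save the raw HTML, append one heading to the org file, and make one commit. Syncing across devices is then just `git push` and `git pull`; nothing here handles encryption or JS-heavy pages.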
FWIW I've done something similar, and lots of sites that lean heavily on JS (and pretty much every single-page web app, like Twitter and FB) will not re-render correctly just because you have the files. It actually takes a lot of work to clone a webpage. The best solution I've found so far is to print a PDF from headless Chrome (but this has its own problems: now you have to deal with a PDF).
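To make the headless Chrome approach concrete, here is a rough Python sketch that shells out to Chrome's command-line PDF printing. The binary name `google-chrome` is an assumption (it may be `chromium`, `chrome`, or a full path on your system), and the function names are just illustrative.

```python
import subprocess
from pathlib import Path


def pdf_command(url: str, out: Path, chrome: str = "google-chrome") -> list:
    """Build the argument list for printing a page to PDF with headless Chrome."""
    # The binary name is an assumption; adjust it for your install.
    return [chrome, "--headless", "--disable-gpu", f"--print-to-pdf={out}", url]


def save_pdf(url: str, out: Path, chrome: str = "google-chrome") -> Path:
    """Run headless Chrome and return the path of the PDF it was asked to write."""
    subprocess.run(pdf_command(url, out, chrome), check=True, timeout=120)
    return out
```

Something like `save_pdf("https://example.com", Path("example.pdf"))` should leave a PDF next to the script, rendered with whatever print styling the site ships.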
Even generating the PDF is a lot harder than it seems, at least if you've never done it before, because there are a lot of gotchas. For example, did you know that many websites provide a second stylesheet for printing, and that it often leaves the page looking only slightly off but still clearly broken? I didn't either.