by ipsi 979 days ago
I've spun up a copy of this recently (within the last month) and it's already proving helpful.

I've purchased a new-build home in Germany, and I'm currently in the stage between "purchased" and "ready for move-in." If you've ever purchased a Neubau in Germany, you know how much paperwork is involved. I get so many documents over email, many of which are scanned (to preserve the wet signature and stamps) and some of which I need to copy into a translator, that this is incredibly helpful. It checks my email, grabs PDFs, straightens them, OCRs them, adds a correspondent, tags them, and makes them available through a web UI.

I also appreciate the full-text search (though it might struggle if I had tens of thousands of documents). I've had to hunt for particular documents where the name of the document I received is a synonym for what the other person is asking for, but the word they're asking for at least appears in the text.

I'll also set it up to pull documents from the folder on my NAS that the scanner writes to, as I also receive a number of documents by physical mail (which I occasionally need to translate or copy/paste from).

There are also some limitations that annoy me:

* I really wish the email filters were more flexible. Right now I have to have three filters, one for PDFs, one for JPEGs, and one for PNGs, so I wish I could just set a regex for the attachment name. This one annoys me enough that if I ever have time I'd look at doing a PR for it (assuming the filtering is done locally and not on the IMAP server).

* I'd also like to be able to set up rules to tag documents based on the email domain (e.g., house-builders get tagged as "house-builder, house") without having to manage a gigantic explosion of rules. In theory the ML should handle that, but... I'm mistrustful of ML. We'll see in a few months if I was too hasty in my judgement or not.

* I'd like to retain slightly more information about the correspondent, like both name and email address (there's no consistency about who has their From line as "Name <email>" and who's just "email", even within the same company), both for de-duplication of correspondents and domain-based searching.

* I wish I could share documents more easily than downloading them and re-uploading them to my email client (or mounting the folders and trying to find the right document, but that has its own set of problems). This is one of those problems that's really easy to state, but potentially quite difficult to actually implement: could a web application add a PDF to the clipboard in such a way that Gmail, say, would understand what was happening and add it as an attachment when pasted?
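
To make the first three wishes concrete, here is a rough Python sketch of what they might look like if the filtering were done locally. It is not how the tool actually works; the function names, the `DOMAIN_TAGS` table, and the `builder.example` domain are all made up for illustration. The only library calls are `re` and `email.utils.parseaddr` from the standard library.

```python
import re
from email.utils import parseaddr

# Hypothetical pattern: one regex instead of three separate PDF/JPEG/PNG filters.
ATTACHMENT_RE = re.compile(r".+\.(pdf|jpe?g|png)", re.IGNORECASE)

# Hypothetical domain-to-tags table; the domain is invented for this example.
DOMAIN_TAGS = {
    "builder.example": ["house-builder", "house"],
}


def wanted_attachment(filename):
    """Return True if the whole attachment name matches the single regex filter."""
    return ATTACHMENT_RE.fullmatch(filename) is not None


def correspondent(from_header):
    """Split a From header into (name, email, domain).

    Handles both "Name <email>" and a bare "email"; the name is an empty
    string when the header carries no display name.
    """
    name, address = parseaddr(from_header)
    address = address.lower()
    domain = address.rpartition("@")[2]
    return name, address, domain


def tags_for(from_header):
    """Look up tags by sender domain, so one rule covers a whole company."""
    _, _, domain = correspondent(from_header)
    return DOMAIN_TAGS.get(domain, [])
```

Keying correspondents on the lowercased address rather than the raw From line would let "Name <email>" and plain "email" collapse into one entry, and keeping the domain alongside it is what makes the domain-based tagging and searching a simple dictionary lookup.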

Overall though, I'm pretty happy with it, and finding it useful so quickly was somewhat surprising.